[R] Reading in a table with ISO-latin1 encoding in MacOS-X (Intel)

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu Jun 8 16:06:38 CEST 2006

Antti Arppe <aarppe at ling.helsinki.fi> writes:

> Converting the file from ISO-latin-1 to UTF-8 (with Mac's TextEdit
> application) allows the file to be read in in its entirety, but still
> the Scandinavian character in the heading is coerced to a period '.',
> or two, in fact (i.e. 'miettiä' -> 'miett..')

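I think you probably need check.names=FALSE. By default, read.table
passes the header through make.names(), which replaces every character
it does not accept in a name with a period. (Presumably, you cannot
have Finnish characters in variable names on the Mac either?)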
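
To illustrate the check.names suggestion, here is a minimal sketch. It
assumes a reasonably current R in which read.table accepts a text=
argument, and it uses a made-up header containing a space and a hyphen,
since those are invalid in names in every locale. Whether a letter such
as 'ä' is also replaced depends on the locale R is running in, so no
result is shown for that case.

```r
# Hypothetical two-column table; the header contains characters
# (a space and a hyphen) that are never valid in syntactic R names.
raw <- "first name\tscore-1\nanna\t3\n"

# Default: the header is passed through make.names().
checked <- read.table(text = raw, header = TRUE, sep = "\t")

# check.names = FALSE keeps the header exactly as it appears in the file.
unchecked <- read.table(text = raw, header = TRUE, sep = "\t",
                        check.names = FALSE)

names(checked)      # "first.name" "score.1"
names(unchecked)    # "first name" "score-1"

# Non-syntactic names remain usable with [[ ]] or backticks.
unchecked[["score-1"]]

# The same mechanism applies to non-ASCII letters; the outcome of this
# call depends on the current locale.
make.names("mietti\u00e4")
```

The trade-off is that columns with non-syntactic names can no longer be
reached as plain df$name and need backticks or [[ ]] instead.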

> Have I possibly misunderstood how the 'file' function should be used
> in conjunction with 'read.table', or might the problem with
> latin1-to-utf conversion be somewhere else?

Not really, text encodings are just a pain. The blame for this fact
can be shifted in various directions, but it doesn't really help...
(My personal angle is that ISO-8859 was terribly shortsighted, and
stuck in a "Western Europe" mindset. As soon as the iron curtain
disappeared and we started to deal with people from Slavic countries,
the weakness was revealed.)

The basic structure looks OK, and works for me on Linux:

> read.table(file("xx.data",encoding="latin1"),header=TRUE)
  æh bøh
1  1   2

so one can only guess that you have a local or Mac-specific setup problem.
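
One way to narrow down such a setup problem is to reproduce the test
with a file whose bytes are known, and to look at the locale R is
running in. The following is only a sketch: the temporary file stands
in for xx.data, the fileEncoding= argument assumes a reasonably current
R, and the header is only expected to come out intact in a locale that
can represent 'æ' and 'ø' (for example a UTF-8 or Latin-1 locale).

```r
# Write the two lines "æh bøh" and "1 2" as raw Latin-1 bytes, so the
# content of the file does not depend on the editor or the locale.
path <- tempfile(fileext = ".data")
writeBin(as.raw(c(0xe6, 0x68, 0x20, 0x62, 0xf8, 0x68, 0x0a,
                  0x31, 0x20, 0x32, 0x0a)), path)

# Variant from the message: declare the encoding on the connection.
d1 <- read.table(file(path, encoding = "latin1"), header = TRUE,
                 check.names = FALSE)

# Equivalent variant: let read.table create the connection itself.
d2 <- read.table(path, header = TRUE, fileEncoding = "latin1",
                 check.names = FALSE)

names(d1)
identical(d1, d2)

# Locale information that decides how the converted text is handled.
Sys.getlocale("LC_CTYPE")
l10n_info()
```

If the header comes back damaged here as well, the locale reported by
the last two calls is the first thing to check.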

   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
