[R] Very slow read.table on Linux, compared to Win2000

davidek at zla-ryba.cz davidek at zla-ryba.cz
Tue Jun 27 18:07:22 CEST 2006

Dear all,

I use read.table to load a 17 MB tab-separated table with 483 variables (mostly numeric) and 15000
observations into R. This takes a few seconds with R 2.3.1 on Windows 2000, but it takes
several minutes on my Linux machine. The Linux machine runs Ubuntu 6.06 with 256 MB RAM and an
Athlon 1600 processor. The Windows hardware is better (Pentium 4, 512 MB RAM), but that
shouldn't make such a difference.

The strange thing is that even doing something simple with the data (say, drawing a histogram of a
variable or converting integers into a factor) takes a really long time on the Linux box, and the
computer seems to access the hard disk heavily.
Could this be caused by swapping? Can I somehow increase the memory allocated to R?
I have checked the manual, but the memory options available on Linux don't seem to
help (I may be doing it wrong, though ...).
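
A rough way to test the swapping hypothesis is to estimate how much memory the table needs once loaded. The sketch below assumes that all 483 columns end up stored as doubles (8 bytes per value), which is only an approximation since the table is "mostly numeric". It then compares that estimate with the size of a synthetic data frame of the same shape. The names n_obs, n_vars and x are illustrative only.

```r
# Assumption: every column is stored as a double (8 bytes per value).
n_obs  <- 15000
n_vars <- 483
est_bytes <- n_obs * n_vars * 8
cat(sprintf("Estimated size in memory: %.1f MiB\n", est_bytes / 1024^2))

# Synthetic data frame of the same shape (all zeros, not the real data),
# to compare the estimate with what R actually allocates.
x <- as.data.frame(matrix(0, nrow = n_obs, ncol = n_vars))
print(object.size(x), units = "Mb")

# gc() reports R's current memory use and triggers a garbage collection.
gc()
```

If the estimate is a large fraction of the 256 MB of physical RAM, swapping becomes plausible, because read.table and later operations may hold temporary copies of the data in addition to the final object.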

The code I run:

TBO <- read.table(file="TBO.dat", sep="\t", header=TRUE, dec=",")   # this takes forever
TBO$sexe <- factor(TBO$sexe, labels=c("man","vrouw"))   # even this takes about 30 seconds,
                                                         # compared to nothing on Win2000
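
One thing that may reduce both the time and the memory used by the read.table call above is to declare the column types and the row count up front through the documented colClasses, nrows and comment.char arguments. The following is a minimal sketch on a tiny synthetic file, not on TBO.dat itself. The file contents, the column names x1 and x2, and the object name tbo_small are made up for illustration; only sexe and its labels come from the code above.

```r
# Synthetic stand-in for TBO.dat: tab-separated, decimal comma, header row.
tmp <- tempfile(fileext = ".dat")
writeLines(c("sexe\tx1\tx2",
             "1\t1,5\t2,25",
             "2\t3,0\t4,75",
             "1\t0,5\t1,00"), tmp)

# Declaring colClasses spares read.table from guessing each column's type;
# nrows gives it the row count in advance; comment.char = "" turns off
# comment scanning.
tbo_small <- read.table(tmp, sep = "\t", header = TRUE, dec = ",",
                        colClasses = c("integer", "numeric", "numeric"),
                        nrows = 3, comment.char = "")

# Same factor conversion as in the original code.
tbo_small$sexe <- factor(tbo_small$sexe, labels = c("man", "vrouw"))
str(tbo_small)

unlink(tmp)  # remove the temporary file
```

For the real file, colClasses would need one entry per column (483 of them), and whether this helps on the Linux machine depends on whether memory is really the bottleneck.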

I'd be grateful for any suggestions,

David Vonka

David Vonka (Netspar, Universiteit van Tilburg, room B-623)
CZ: Ovci Hajek 42, Praha 5, Czech Republic, tel: +420777022926 
NL: Telefoonstraat 1, 5038DL Tilburg, The Netherlands, tel:+31638083064

