[R] Failing on reading a "slightly big" dataset
Prof Brian Ripley
ripley at stats.ox.ac.uk
Mon Jul 5 12:25:00 CEST 2004
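You are asking read.table to interpret both quote and comment characters
in your file: by default it treats " and ' as quotes and # as the start
of a comment. You do seem to have quotes (e.g. 'R' INVEST PVT. LTD.) --
are they always matched? An unmatched quote can swallow the following
lines into a single field, and a # can cut a line short, either of which
would explain an error and a short row count.
Please read through the R Data Import/Export manual and check out all the
options, in particular quote and comment.char.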
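
As a rough sketch of what that might look like (not tested against your
actual file): the sample below rebuilds a few lines in the layout you
posted, then reads them with quote and comment processing switched off.
The last two company names and the column names are made up for
illustration, and skip = 1 assumes the title line is the only one above
the header.

```r
# Small pipe-delimited sample in the same layout as the posted file.
# The last two company names are hypothetical, added to include an
# unmatched apostrophe and a '#' character.
lines <- c(
  "All figures are for the year 20030331|||",
  "Company|GVA Less Interest (Rs. thousand)|Interest (Rs. thousand)|GVA (Rs. thousand)",
  "'R' INVEST PVT. LTD.|-510.45|0.18|-510.27",
  "20 MICRONS LTD.|60700|41200|101900",
  "EXAMPLE'S TRADING LTD.|10|1|11",
  "EXAMPLE #2 LTD.|20|2|22"
)
path <- tempfile(fileext = ".text")
writeLines(lines, path)

# Diagnostic: fields per line as R sees them, with quotes and comments
# treated as ordinary characters.
n_fields <- count.fields(path, sep = "|", quote = "", comment.char = "")

# Read the data: skip the title line, take the next line as the header,
# and disable quote and comment interpretation.
# The col.names values are illustrative replacements for the long headers.
firms <- read.table(path, sep = "|", skip = 1, header = TRUE,
                    quote = "", comment.char = "",
                    col.names = c("company", "gva_less_interest",
                                  "interest", "gva"),
                    stringsAsFactors = FALSE)
```

Running count.fields a second time with the default quote and
comment.char, and comparing the two results, is one way to find the
lines that are causing trouble.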
On Mon, 5 Jul 2004, Ajay Shah wrote:
> I have a file with 4 columns per line, all pipe delimited.
>
> $ wc -l cmie_firm_data.text
> 89325 cmie_firm_data.text
> $ ls -al cmie_firm_data.text
> -rw-r--r-- 1 ajayshah ajayshah 4415637 Jul 5 15:25 cmie_firm_data.text
> $ awk -F\| '(NF != 4)' cmie_firm_data.text
> $ head cmie_firm_data.text
> All figures are for the year 20030331|||
> Company|GVA Less Interest (Rs. thousand)|Interest (Rs. thousand)|GVA (Rs. thousand)
> 'R' INVEST PVT. LTD.|-510.45|0.18|-510.27
> 20 MICRONS LTD.|60700|41200|101900
> 20TH CENTURY FOX CORPN. (INDIA) PVT. LTD.|50|0.33|50.33
> 21ST CENTURY AUTOMOTIVE INDIA LTD.|201.14|0.19|201.33
> 21ST CENTURY ENTERTAINMENT PVT. LTD.|-6.10|0|-6.10
> 21ST CENTURY EQUIPMENTS PVT. LTD.|-1599.53|1262.76|-336.77
> 21ST CENTURY INFRASTRUCTURE (INDIA) PVT. LTD.|140.48|1.74|142.22
> 21ST CENTURY PEST CONTROL SERVICES LTD.|50.21|7.13|57.34
>
> When I try to read this into R, I get a mysterious error, and then it
> reads only 38,244 observations. Any idea what might be going wrong?
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595