[R] how to identify record with broken format
Duncan Murdoch
murdoch@dunc@n @end|ng |rom gm@||@com
Wed Jun 5 12:23:26 CEST 2019
On 05/06/2019 6:12 a.m., Luigi Marongiu wrote:
> Dear all,
> I have a large dataframe where one of the records in a column must
> have been wrongly formatted; in particular, I think it is missing a
> closing ".
> When I try to show only that column's values I get a [1] with plenty of
> empty space, the final record [45], and the system freezes. Also, when
> I try to plot, I get a table's printout instead of a real plot.
>
> Is there a way to identify the record with the broken format? In a
> spreadsheet or text editor, all records seem OK, and there are too
> many records to visually inspect them all.
>
Without seeing the data it is hard to be specific, but the
count.fields() function should normally return the same number of fields
for every line, so a line whose count differs (or is NA) points to the
problem. You may need to specify some of its optional arguments,
e.g. sep="," for a CSV file, etc.
For example, with this file:
1,2,3
1,2,"4"
1,2,"
1,2,5
1,2,"6"
I see
> count.fields("~/temp/test.txt",sep=",")
[1] 3 3 NA NA NA 3
indicating that there are problems on lines 3-5 (a missing closing quote
on line 3).
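
To avoid scanning the counts by eye on a large file, the check can be wrapped in a small function. This is only a sketch: find_bad_lines() is a hypothetical helper name, not part of R, and it assumes that the most common field count is the correct one. It recreates the example file above in a temporary location so it can be run as-is.

```r
# Hypothetical helper (not part of R): return the positions whose field
# count is NA or differs from the most common count in the file.
find_bad_lines <- function(path, sep = ",") {
  # blank.lines.skip = FALSE keeps blank lines in the result, so that
  # positions are not shifted relative to line numbers in the file.
  n <- count.fields(path, sep = sep, blank.lines.skip = FALSE)
  # Assumption: the most frequent count is the intended one.
  # table() ignores NA values by default.
  expected <- as.integer(names(which.max(table(n))))
  which(is.na(n) | n != expected)
}

# Recreate the five-line example file from this message.
tmp <- tempfile(fileext = ".txt")
writeLines(c('1,2,3', '1,2,"4"', '1,2,"', '1,2,5', '1,2,"6"'), tmp)

bad <- find_bad_lines(tmp)
print(bad)
```

The first position reported is the place to start looking, since an unclosed quote on one line also affects the lines that follow it. For a file that uses a different separator, pass it via sep, for example sep="\t" for tab-delimited data.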
Duncan Murdoch