[R] Why Numeric Values Become Factors in Data Frame
Rich Shepard
rshepard at appl-ecosys.com
Tue Nov 29 22:48:55 CET 2011
On Tue, 29 Nov 2011, Rich Shepard wrote:
> Pointers on how to determine why this one variable has some values and
> characters rather than as numerics are needed.
Joshua, Marc, David, Bill, Sarah, Bert, et al.:
Thank you all for the insights and ideas. It was a valuable lesson and it
helped me fix the problem.
Somehow my client had URLs in two data cells of the original Excel
spreadsheet. I removed them in my LibreOffice copy and exported the file as
a .csv, but I was still using a prior version, with the cruft still in it,
when I read the data into R.
Now that I have corrected the problem (and fixed mis-entered conductivity
values < 100) the R data frame is correct:
str(waterchem)
'data.frame': 3524 obs. of 39 variables:
$ site : Factor w/ 64 levels "D-1","D-2","D-3",..: 1 1 1 1 1 1 ...
$ sampdate : Date, format: "2007-12-12" "2008-03-15" ...
$ Ag : num 0 0 0 0 0 0 0 0 0 0 ...
$ Al : num 0.106 0.08 0.116 0.08 0.08 0.08 0.08 0.08 0.08 0.08 ...
$ CO3 : num 1 1 6.7 1 1 1 1 1 1 1 ...
...
$ SC : num 630 633 386 503 83.2 538 1450 1130 1040 940 ...
I knew there was a non-number in there but didn't see it. Your guidance
not only taught me how to find it, but also made me aware that while I was
searching in the cleaned-up text file, R was being fed the old version.
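
For the archive, here is a minimal sketch of one common way to locate such a
value. It is not necessarily what was suggested in the thread. The data frame
name waterchem_old, the toy values, and the example.com URL are made up for
illustration; stringsAsFactors = TRUE is set explicitly so the column comes in
as a factor regardless of the R version's default.

```r
# Toy stand-in for the uncleaned data: one stray URL among the numbers
# is enough to make the whole column a factor instead of numeric.
raw <- c("630", "633", "386", "http://example.com/lab", "83.2")
waterchem_old <- data.frame(SC = raw, stringsAsFactors = TRUE)

str(waterchem_old$SC)   # reports a Factor, not num

# Go through character first: as.numeric() on a factor would return the
# internal level codes, not the values.
sc_chr <- as.character(waterchem_old$SC)
sc_num <- suppressWarnings(as.numeric(sc_chr))

# Entries that failed to convert, excluding ones that were already missing.
bad <- which(is.na(sc_num) & !is.na(sc_chr))
bad           # row numbers of the offending cells
sc_chr[bad]   # the offending text itself
```

Running the same check on each column that str() reports as a Factor but
should be numeric points straight at the rows to fix in the source file.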
Very much appreciated,
Rich