[R] 2 Seemingly Simple Problems
ripley@stats.ox.ac.uk
Fri May 31 16:07:34 CEST 2002
On Fri, 31 May 2002, MATT BORKOWSKI wrote:
> Alright...these two issues seem rather simple. But I had trouble finding much
> about either of them in the archives.
>
> 1) Using scan()
> I'm trying to use scan to read in a large data set since read.table() is taking
> quite a bit of time. But when I try to do this I receive an error message along
> the lines of "Character where numeric expected." Seems to me the problem is
> arising because my data is composed of both characters and numbers, but R
> is only expecting numerics. I assume the key to this problem lies in the
> "what=" parameter. But I'm not sure what to set this to so that R expects
> characters or numbers.
See the help page for scan, especially the examples. However, since
read.table calls scan itself, you will get little gain provided you use
colClasses in read.table.
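
As a rough illustration of both routes (the two-column data and the names label and value are made up for the example, not taken from the original poster's file), one might write something like:

```r
# Hypothetical input: a character field followed by a numeric field.
txt <- c("a 1.5", "b 2.5", "c 3.5")

# scan(): 'what' is a list with one template per field; "" marks a
# character field and 0 marks a numeric field.
fields <- scan(text = txt, what = list(label = "", value = 0), quiet = TRUE)

# read.table(): colClasses declares the column types up front, so R does
# not have to guess them while reading.
df <- read.table(text = txt,
                 colClasses = c("character", "numeric"),
                 col.names = c("label", "value"))
```

Here scan() returns a list with one vector per field, while read.table() returns a data frame. For a real file, replace text = txt with the file name.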
> 2) Testing for 'NA' values
> In this problem I have read in a large data set. Some of the lines of data are
> not as long and therefore the last few columns have been filled in with 'NA.'
> Now I'm trying to read through rows of data backwards because the parameter
> I'm trying to extract from the data.frame is not always in column 5 but is always
> the second real value after the 'NA's' if that makes any sense. But I don't think
(No. The NAs are at the end of the row, so the second before?)
> that's all that important anyway. The point is...I'm trying to extract the second
> value after the 'NA' values by ignoring the 'NA' values and counting any real
> values. I'm trying to accomplish this with:
>
> if(data[r,c] != NA) count <- count +1
>
> However, I receive the error: "Value missing where logical expected". I assume
> this is happening because I'm testing for 'NA' values. Is there any way around
> this? Is there a way to count the number of 'NA' numbers or a way to skip over
> them?
is.na(data[r,]) would be a good start. Something like
{xx <- is.na(data[r,]); n <- sum(!xx); data[r, n-1]}
for one row perhaps? Or to vectorize
nn <- rowSums(!is.na(data)) # number of non-NA values in each row
data[cbind(seq(along=nn), nn-1)]
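
A small worked sketch of the idea, on a made-up data frame whose rows are padded with NA at the end (the column names and values are hypothetical). It assumes that every row has at least two non-NA values and that the NAs occur only at the end of a row; other rows would need separate handling.

```r
# Hypothetical data: shorter rows are padded with NA on the right.
data <- data.frame(a = c(1, 4, 7),
                   b = c(2, 5, 8),
                   c = c(3, 6, NA),
                   d = c(9, NA, NA))

# Any comparison with NA is itself NA, which is why
# if (data[r, c] != NA) fails; use is.na() to test instead.
cmp <- data[3, 3] != NA        # NA, not TRUE or FALSE

# Number of non-NA values in each row.
nn <- rowSums(!is.na(data))

# Matrix indexing: one (row, column) pair per row, picking the
# second-to-last non-NA value of each row.
vals <- data[cbind(seq_along(nn), nn - 1)]
```

With this data nn is 4, 3, 2 and vals is 3, 5, 7.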
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595