[R] Unwanted Levels in R
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Tue May 21 16:34:53 CEST 2002
"MATT BORKOWSKI" <mpb170 at psu.edu> writes:
> To clarify:
> The lines beginning with A,B,C,D,E are part of a header file. Below the header
> are lines that contain values that correspond. The problem is that there are
> a number of data sets combined, so the header randomly repeats after a
> varying number of data lines. Would it solve the problem to simply treat the lines
> that begin with A, B, C, D, or E differently? If so, how do they need to be treated?
> I've copied a bit more of the data below to demonstrate more clearly how the
> data is arranged within the file.
>
> A 900003024 ODEN SWEDEN ODEN91 NSIDC.ORG/PROJE
> B 900003 -9 1 NAN OBS 0
> C 1991 9 7 13 -9 XX 90.0000 .0000 XX
> D 36 10.0 10.1 4183.0 4270.7 4219.0 Z 13 0 OBSERV
> E -9.0 -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000
> 25.0 25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000 8.630
> 50.0 50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000 8.280
> 89.0 90.0 -1.6550 -1.6566 32.8530 26.4320 35.8807 44.9043 -9.0000 7.430
> 109.0 110.3 -1.5420 -1.5444 33.8830 27.2659 36.6893 45.6886 -9.0000 7.360
> ...
> ...
> ...
> A 900002034 LOUIS ST: LAURENT UNITED STATES AO1994 NSIDC.ORG/PROJE
> B 900002 -9 1 NAN OBS 0
> C 1994 8 20 22 -9 XX 89.0167 137.1517 XX
> D 36 13.0 13.1 4075.0 4159.4 4075.0 Z 13 0 LASTLE
> E -9.0 -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000
> 13.0 13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000 8.580
Hmm. If you're on a Unix(-like) system, I suggest you preprocess with
grep -v "^[A-E]", which drops every line that starts with one of the
header letters and leaves only the data lines. On Windows, you could
fetch the grep program and do likewise (there is one in
http://www.stats.ox.ac.uk/pub/Rtools/tools.zip).
In pure R, I suppose a combination of readLines(), grep(),
writeLines() (to a temp file) and read.table() would do the trick.
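
A minimal sketch of that pure-R route, not tested against the real file. The
sample lines are an abbreviated copy of the quoted data, the file name in the
comment is a placeholder, and grepl() is used instead of grep() only so that
the filter still behaves when no line matches:

```r
# Abbreviated copy of the quoted layout: header lines start with A-E,
# data lines start with a number. For the real file, something like
#   lines <- readLines("yourfile.txt")   # placeholder name
lines <- c(
  "A 900003024 ODEN SWEDEN ODEN91 NSIDC.ORG/PROJE",
  "E -9.0 -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "25.0 25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000 8.630",
  "50.0 50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000 8.280",
  "A 900002034 LOUIS ST: LAURENT UNITED STATES AO1994 NSIDC.ORG/PROJE",
  "E -9.0 -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "13.0 13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000 8.580"
)

# Keep only the lines that do NOT start with A, B, C, D or E
# (the same filter as grep -v "^[A-E]").
keep <- lines[!grepl("^[A-E]", lines)]

# Write the surviving data lines to a temp file and read them as a table.
tmp <- tempfile()
writeLines(keep, tmp)
dat <- read.table(tmp, header = FALSE)
unlink(tmp)

str(dat)
```

With the header lines gone, every column should come back numeric rather
than as a factor, which is presumably where the unwanted levels came from.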
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907