[R] how to merge commands
Nino Pierantonio
nino.p.80 at gmail.com
Tue Feb 21 18:45:21 CET 2012
Dear all,
I am using R to work on a large amount of telemetry data, stored as one
file per day. Each file (an xlsx file) contains 2 columns, the first one
for sst readings and the second one for chl readings, and 72360 rows,
each corresponding to the centre of a cell in my study area. The columns
have no headings. Many cells have fake readings (-999.0000000). What I
want to do is merge the files by month and by season, replace the fake
readings with NA, and then calculate, for both sst and chl, the mean
value of each row. I have stored the files in the directory C:/Temp. This
directory contains 12 subfolders, January to December, and each
subfolder contains one file per day of that month (e.g. 31 files for
January, 28 files for February, and so on).
I already have commands that work properly, but I would really like to
know whether it is possible to reduce their number and maybe to run some
of them automatically. I work month by month, as follows (I am aware
this is not the most elegant way to do it; I'm new to R and for the
moment elegance and style are not my main goal):
# Set the working directory
setwd("C:/Temp/January09")
# Load the "xlsx" package, needed to handle the original *.xlsx files
library(xlsx)
list.jan09 <- list.files("C:/Temp/January09", full.names = TRUE)
read.all.jan09 <- lapply(list.jan09, read.xlsx, 1, header = FALSE)
# Bind the daily files side by side into one data frame with all my data
daily.all.jan09 <- do.call("cbind", read.all.jan09)
# Data frame with only the sst readings (the first column of each daily
# file): 31 columns and 72360 rows
daily.sst.jan09 <- daily.all.jan09[, seq(from = 1, to = 61, by = 2)]
# Data frame with only the chl readings (the second column of each daily
# file): 31 columns and 72360 rows
daily.chl.jan09 <- daily.all.jan09[, seq(from = 2, to = 62, by = 2)]
# Replace the -999.0000000 values with NA
daily.sst.jan09 <- replace(daily.sst.jan09, daily.sst.jan09 == -999.0000000, NA)
# Vector with the mean sst value of each row; na.rm = TRUE is needed,
# otherwise every row containing an NA gets a mean of NA
jan09_avgsst <- rowMeans(daily.sst.jan09, na.rm = TRUE)
# Store the sst vector
write.xlsx(jan09_avgsst, "C:/Users/AAA/Desktop/Data/january09_avgsst.xlsx")
# Replace the -999.0000000 values with NA
daily.chl.jan09 <- replace(daily.chl.jan09, daily.chl.jan09 == -999.0000000, NA)
# Vector with the mean chl value of each row
jan09_avgchl <- rowMeans(daily.chl.jan09, na.rm = TRUE)
# Store the chl vector
write.xlsx(jan09_avgchl, "C:/Users/AAA/Desktop/Data/january09_avgchl.xlsx")
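
The sst and chl steps above differ only in which column they use, so they could probably be folded into one small function. What follows is only a sketch: the name daily_mean and its arguments are made up for illustration, and the three toy data frames stand in for the real daily files so that the example runs without them.

```r
# Hypothetical helper (not part of the original commands): average one
# variable across a list of daily data frames, treating the sentinel
# value as missing.
daily_mean <- function(daily, column, sentinel = -999) {
  # Build a matrix with one column per day and one row per cell
  m <- sapply(daily, function(d) d[[column]])
  m[m == sentinel] <- NA
  rowMeans(m, na.rm = TRUE)
}

# Toy stand-in for three daily files with 4 cells each (sst, chl)
daily <- list(
  data.frame(sst = c(10, -999, 12, -999), chl = c(0.1, 0.2, -999, -999)),
  data.frame(sst = c(12, 14, -999, -999), chl = c(0.3, -999, 0.5, -999)),
  data.frame(sst = c(14, 16, 18, -999),   chl = c(-999, 0.4, 0.7, -999))
)

# Column 1 holds sst and column 2 holds chl, as in the daily files
monthly <- data.frame(
  avgsst = daily_mean(daily, 1),
  avgchl = daily_mean(daily, 2)
)
print(monthly)
```

With the real data, the list would presumably come from the same call as above, lapply(list.jan09, read.xlsx, 1, header = FALSE). A cell with no valid reading on any day (the last row of the toy data) gets NaN rather than NA, because rowMeans has nothing left to average once the missing values are dropped.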
I repeat these same commands for every month and for every season
(January-March; April-June; July-September; October-December), so the
whole thing is rather redundant.
How can I speed up the process, reduce the number of commands and maybe
run them automatically? Many thanks for your help.
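
To make the question concrete, here is a rough sketch of the kind of automation meant: one function applied to every month folder, and to groups of folders for the seasons. Everything named here is an assumption for illustration (folder_means, the reader argument, the "February09" folder name, the toy CSV files written to a temporary directory), and a season is taken to be the mean over all the daily files of its months, not the mean of the monthly means.

```r
# Hypothetical helper: read every file in the given folders and return
# the per-cell means of column 1 (sst) and column 2 (chl), with the
# sentinel value treated as missing. `reader` stands in for read.xlsx.
folder_means <- function(dirs, reader, sentinel = -999) {
  files <- list.files(dirs, full.names = TRUE)
  daily <- lapply(files, reader)
  avg <- function(column) {
    # One column per daily file, one row per cell
    m <- sapply(daily, function(d) d[[column]])
    m[m == sentinel] <- NA
    rowMeans(m, na.rm = TRUE)
  }
  data.frame(avgsst = avg(1), avgchl = avg(2))
}

# Toy data: two month folders with two small CSV files each, so that
# the sketch runs without the real *.xlsx files
root <- file.path(tempdir(), "telemetry-demo")
months <- c("January09", "February09")
for (m in months) {
  dir.create(file.path(root, m), recursive = TRUE, showWarnings = FALSE)
  for (day in 1:2) {
    toy <- data.frame(sst = c(10 + day, -999), chl = c(0.1 * day, 0.5))
    write.table(toy, file.path(root, m, sprintf("day%02d.csv", day)),
                sep = ",", row.names = FALSE, col.names = FALSE)
  }
}
reader <- function(f) read.csv(f, header = FALSE)

# One result per month
by_month <- lapply(file.path(root, months), folder_means, reader = reader)
names(by_month) <- months

# One result per season: pool all the daily files of the season's months
seasons <- list(JanFeb = c("January09", "February09"))
by_season <- lapply(seasons, function(ms) {
  folder_means(file.path(root, ms), reader)
})
print(by_month$January09)
print(by_season$JanFeb)
```

With the real files, root would be "C:/Temp", reader would be something like function(f) read.xlsx(f, 1, header = FALSE), and each season would list three folder names. Each element of by_month and by_season could then be passed to write.xlsx in a loop over its names.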
Cheers,
Nino
--
Nino Pierantonio
Mobile: +39 349.532.9370
Skype: pierantonio_nino