[R] how to merge commands

Nino Pierantonio nino.p.80 at gmail.com
Tue Feb 21 18:45:21 CET 2012


Dear all,

I am using R to work on a large amount of telemetry data split into daily 
files. Each file (an xlsx file) contains 2 columns, the first one for sst 
readings and the second one for chl readings, and 72360 rows, each 
corresponding to the centre of a cell in my study area. The columns have 
no headings. Lots of cells have fake readings (-999.0000000). What I 
want to do is merge the files together by month and by season, replace 
the fake values with NA, and then calculate the row averages for both 
sst and chl. I have stored the files in the directory C:/TEMP. This 
directory contains 12 subfolders, January to December, and each subfolder 
contains one file per day of the month (e.g. January 31 files, February 
28 files, and so on).
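
As a small illustration of the goal described above, here is a minimal sketch on made-up numbers (a toy 3-cell by 3-day matrix, not the real files). It shows only the two core steps: turning the -999 marker into NA and averaging each row while skipping the missing days.

```r
# Toy data (invented): 3 cells (rows) x 3 days (columns); -999 marks a fake reading.
sst <- matrix(c(15.2, -999, 14.8,
                16.0, 16.4, -999,
                -999, -999, -999),
              nrow = 3, byrow = TRUE)

# Replace the marker value with NA.
sst[sst == -999] <- NA

# Average each cell over the days, ignoring missing days.
# A cell with no valid reading at all comes out as NaN.
avg_sst <- rowMeans(sst, na.rm = TRUE)
print(avg_sst)
```

Without na.rm = TRUE, every row that contains at least one NA would itself average to NA.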

I already have commands that work properly, but I would really like to 
know whether it is possible to reduce their number and, maybe, to run 
some of them automatically. What I do is work month by month as follows 
(I am aware this is not the most elegant way to do it; I'm new to R and 
for the moment "elegance & style" is not my main goal):

 >setwd("C:/Temp/January09")	# set the working directory
 >library(xlsx)	# load the "xlsx" package needed to read the original *.xlsx files
 >list.jan09<-list.files("C:/Temp/January09", full.names=TRUE)
 >read.all.jan09<-lapply(list.jan09, read.xlsx, 1, header=FALSE)
 >daily.all.jan09<-do.call("cbind",read.all.jan09)	# one data frame containing all the data
 >daily.sst.jan09<-daily.all.jan09[,seq(from=1,to=61,by=2)]	# sst only (first column of each daily file): 31 columns, 72360 rows
 >daily.chl.jan09<-daily.all.jan09[,seq(from=2,to=62,by=2)]	# chl only (second column of each daily file): 31 columns, 72360 rows
 >daily.sst.jan09<-replace(daily.sst.jan09,daily.sst.jan09==-999,NA)	# replace -999.0000000 values with NA
 >jan09_avgsst<-rowMeans(daily.sst.jan09, na.rm=TRUE)	# mean sst of each row; na.rm=TRUE skips the NA values, otherwise any row containing an NA averages to NA
 >write.xlsx(jan09_avgsst, "C:/Users/AAA/Desktop/Data/january09_avgsst.xlsx")	# store the sst vector
 >daily.chl.jan09<-replace(daily.chl.jan09,daily.chl.jan09==-999,NA)	# replace -999.0000000 values with NA
 >jan09_avgchl<-rowMeans(daily.chl.jan09, na.rm=TRUE)	# mean chl of each row, again skipping the NA values
 >write.xlsx(jan09_avgchl, "C:/Users/AAA/Desktop/Data/january09_avgchl.xlsx")	# store the chl vector
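
One possible way to shorten the month-by-month commands is to wrap them in a single function. The sketch below is only an illustration under stated assumptions: `average_folder`, its `reader` argument and its `missing` argument are hypothetical names, and each daily file is assumed to yield a two-column data frame (sst first, chl second). The default reader repeats the `read.xlsx(file, 1, header=FALSE)` call used above.

```r
# Hypothetical helper: average sst and chl over all daily files of one folder.
# `reader` can be any function that returns a two-column data frame
# (sst, chl) for one element of `files`.
average_folder <- function(files,
                           reader = function(f) xlsx::read.xlsx(f, 1, header = FALSE),
                           missing = -999) {
  days <- lapply(files, reader)

  # Take column j from every day and bind the results side by side:
  # one row per cell, one column per day. Then mark fake readings as NA.
  pick <- function(j) {
    m <- do.call(cbind, lapply(days, function(d) as.numeric(d[[j]])))
    m[which(m == missing)] <- NA
    m
  }

  data.frame(avg_sst = rowMeans(pick(1), na.rm = TRUE),
             avg_chl = rowMeans(pick(2), na.rm = TRUE))
}
```

Under these assumptions a month would reduce to one call, for example average_folder(list.files("C:/Temp/January09", full.names=TRUE)), and the returned data frame could be passed to write.xlsx() as before.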

I repeat these same commands for all the months and for the seasons 
(January-March; April-June; July-September; October-December), so the 
whole thing is a bit redundant.
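
The repetition over months and seasons could be handled by keeping one matrix per month in a named list and looping over it. The sketch below uses invented random numbers in place of the real readings (2 cells, 3 days per month), and the season labels are made up for the example; only the four month groupings come from the description above.

```r
# Made-up stand-in for the real data: one matrix per month
# (rows = cells, columns = days), fake readings already set to NA.
set.seed(1)
monthly <- lapply(setNames(month.name, month.name),
                  function(m) matrix(runif(2 * 3), nrow = 2))

# The four seasons, expressed as groups of month names.
seasons <- list("Jan-Mar" = month.name[1:3],
                "Apr-Jun" = month.name[4:6],
                "Jul-Sep" = month.name[7:9],
                "Oct-Dec" = month.name[10:12])

# One column of row means per month.
month_avg <- sapply(monthly, rowMeans, na.rm = TRUE)

# For each season, bind the daily columns of its months, then average,
# so every day of the season carries the same weight.
season_avg <- sapply(seasons, function(ms) {
  rowMeans(do.call(cbind, monthly[ms]), na.rm = TRUE)
})
```

Both results are matrices with one row per cell, one column per month or season, so each column can be written out separately if needed.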

How can I speed up the process, reduce the number of commands and maybe 
run them automatically? Many thanks for your help.

Cheers,
Nino

-- 
Nino Pierantonio

Mobile: +39 349.532.9370
Skype: pierantonio_nino




More information about the R-help mailing list