[R] data format
arun
smartpink111 at yahoo.com
Wed Feb 20 15:30:31 CET 2013
Hi elisa,
Try this:
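# Example data: one 12 x 1 matrix, repeated to mimic the list
mat1 <- matrix(signif(c(1.200407, 1.861941, 1.560613, 2.129241, 2.047772, 1.784105,
                        1.777159, 1.988596, 2.163199, 2.446993, 3.593623, 5.706672),
                      digits = 3), ncol = 1)
list1 <- list(mat1, mat1, mat1)

# Attach monthly dates formatted as "YYYY.mm.dd"
list2 <- lapply(list1, function(x)
  data.frame(date1 = format(seq.Date(as.Date("1911.01.01", format = "%Y.%m.%d"),
                                     by = "month", length.out = 12),
                            format = "%Y.%m.%d"),
             value = x, stringsAsFactors = FALSE))

# Replace a leading zero in the month (character 6) or day (character 9) with a space
list3 <- lapply(list2, function(x) {
  substr(x[, 1], 6, 6) <- ifelse(substr(x[, 1], 6, 6) == "0", " ", substr(x[, 1], 6, 6))
  substr(x[, 1], 9, 9) <- ifelse(substr(x[, 1], 9, 9) == "0", " ", substr(x[, 1], 9, 9))
  x
})

# Format the values to two decimals and prepend the two header lines
list4 <- lapply(list3, function(x) {
  x[, 2] <- sprintf("%.2f", x[, 2])
  data.frame(col1 = c("EXACT DATA", "FROM 1911 1 1 TO 1911 12 1", do.call(paste, x)),
             stringsAsFactors = FALSE)
})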
list4[[1]]
# col1
#1 EXACT DATA
#2 FROM 1911 1 1 TO 1911 12 1
#3 1911. 1. 1 1.20
#4 1911. 2. 1 1.86
#5 1911. 3. 1 1.56
#6 1911. 4. 1 2.13
#7 1911. 5. 1 2.05
#8 1911. 6. 1 1.78
#9 1911. 7. 1 1.78
#10 1911. 8. 1 1.99
#11 1911. 9. 1 2.16
#12 1911.10. 1 2.45
#13 1911.11. 1 3.59
#14 1911.12. 1 5.71
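
The same steps can also be wrapped in a small function, which may be easier to apply to all 124 list elements. This is only a sketch: pad_date(), plain_date() and format_block() are names made up here, and the start date is assumed to be 1911-01-01 as in your example. Note that sprintf("%.2f") rounds (2.129241 becomes 2.13), while your sample shows 2.12; if truncation is what you want, the truncate argument below is one possible way to get it.

```r
# Hypothetical helpers, not from any package.

# "1911. 1. 1": month and day padded with a space instead of a zero
pad_date <- function(d) {
  sprintf("%d.%2d.%2d",
          as.integer(format(d, "%Y")),
          as.integer(format(d, "%m")),
          as.integer(format(d, "%d")))
}

# "1911 1 1": the unpadded form used in the FROM ... TO ... header
plain_date <- function(d) {
  sprintf("%d %d %d",
          as.integer(format(d, "%Y")),
          as.integer(format(d, "%m")),
          as.integer(format(d, "%d")))
}

# Build the header lines plus one "date value" line per monthly value
format_block <- function(values, start = as.Date("1911-01-01"), truncate = FALSE) {
  values <- as.numeric(values)
  dates <- seq(start, by = "month", length.out = length(values))
  if (truncate) {
    # cut to two decimals; the inner round() guards against floating-point noise
    values <- trunc(round(values * 100, 6)) / 100
  }
  c("EXACT DATA",
    sprintf("FROM %s TO %s", plain_date(dates[1]), plain_date(dates[length(dates)])),
    paste(pad_date(dates), sprintf("%.2f", values)))
}

mat1 <- matrix(c(1.200407, 1.861941, 1.560613, 2.129241, 2.047772, 1.784105,
                 1.777159, 1.988596, 2.163199, 2.446993, 3.593623, 5.706672),
               ncol = 1)
list1 <- list(mat1, mat1, mat1)

rounded   <- lapply(list1, format_block)                   # 2.129241 -> "2.13"
truncated <- lapply(list1, format_block, truncate = TRUE)  # 2.129241 -> "2.12"
```

Each element of rounded or truncated is a plain character vector, so it can be printed with cat(rounded[[1]], sep = "\n") if needed.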
A.K.
________________________________
From: eliza botto <eliza_botto at hotmail.com>
To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com>
Sent: Wednesday, February 20, 2013 8:25 AM
Subject: RE: data format
Dear Arun,
I have a small question, and I hope you won't mind.
If I have a list of 124 elements like the following
[[1]]
[,1]
[1,] 1.200407
[2,] 1.861941
[3,] 1.560613
[4,] 2.129241
[5,] 2.047772
[6,] 1.784105
[7,] 1.777159
[8,] 1.988596
[9,] 2.163199
[10,] 2.446993
[11,] 3.593623
[12,] 5.706672
and i want them all in the following manner
[[1]]
EXACT DATA
FROM 1911 1 1 TO 1911 12 1
1911. 1. 1 1.20
1911. 2. 1 1.86
1911. 3. 1 1.56
1911. 4. 1 2.12
1911. 5. 1 2.04
1911. 6. 1 1.78
1911. 7. 1 1.77
1911. 8. 1 1.98
1911. 9. 1 2.16
1911.10. 1 2.44
1911.11. 1 3.59
1911.12. 1 5.70
The date pattern should be the same as before, and
the following two lines should be inserted at the top of every list element:
"EXACT DATA
FROM 1911 1 1 TO 1911 12 1"
Thank you so very much in advance. I hope you won't mind my frequent questions.
elisa
> Date: Tue, 19 Feb 2013 08:18:25 -0800
> From: smartpink111 at yahoo.com
> Subject: Re: data format
> To: eliza_botto at hotmail.com
>
>
>
> Hi Elisa,
> No problem.
> Arun
>
>
>
>
> ________________________________
> From: eliza botto <eliza_botto at hotmail.com>
> To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com>
> Sent: Tuesday, February 19, 2013 11:10 AM
> Subject: RE: data format
>
>
>
> Thanks arun. it worked!!
> i am so glad....
>
> elisa
>
>
> > Date: Tue, 19 Feb 2013 07:22:20 -0800
> > From: smartpink111 at yahoo.com
> > Subject: Re: data format
> > To: eliza_botto at hotmail.com
> > CC: r-help at r-project.org
> >
> > Hi,
> > Try this:
> > el<- read.csv("el.csv",header=TRUE,sep="\t",stringsAsFactors=FALSE)
> > elsplit<- split(el,el$st)
> >
> > datetrial<-data.frame(date1=seq.Date(as.Date("1930.1.1",format="%Y.%m.%d"),as.Date("2010.12.31",format="%Y.%m.%d"),by="day"))
> > elsplit1<- lapply(elsplit,function(x) data.frame(date1=as.Date(paste(x[,2],x[,3],x[,4],sep="-"),format="%Y-%m-%d"),discharge=x[,5]))
> > elsplit2<-lapply(elsplit1,function(x) x[order(x[,1]),])
> > library(plyr)
> > elsplit3<-lapply(elsplit2,function(x) join(datetrial,x,by="date1",type="full"))
> > elsplit4<-lapply(elsplit3,function(x) {x[,2][is.na(x[,2])]<- "-9999.000";x})
> > elsplit5<-lapply(elsplit4,function(x) {x[,1]<-format(x[,1],"%Y.%m.%d");x})
> > elsplit6<-lapply(elsplit5,function(x){substr(x[,1],6,6)<-ifelse(substr(x[,1],6,6)==0," ",substr(x[,1],6,6));substr(x[,1],9,9)<- ifelse(substr(x[,1],9,9)==0," ",substr(x[,1],9,9));x})
> > elsplit6[[1]][1:4,]
> > # date1 discharge
> > #1 1930. 1. 1 -9999.000
> > #2 1930. 1. 2 -9999.000
> > #3 1930. 1. 3 -9999.000
> > #4 1930. 1. 4 -9999.000
> >
> > length(elsplit6)
> > #[1] 124
> > tail(elsplit6[[124]],25)
> > # date1 discharge
> > #29561 2010.12. 7 -9999.000
> > #29562 2010.12. 8 -9999.000
> > #29563 2010.12. 9 -9999.000
> > #29564 2010.12.10 -9999.000
> > #29565 2010.12.11 -9999.000
> > #29566 2010.12.12 -9999.000
> > #29567 2010.12.13 -9999.000
> > #29568 2010.12.14 -9999.000
> > #29569 2010.12.15 -9999.000
> > #29570 2010.12.16 -9999.000
> > #29571 2010.12.17 -9999.000
> > #29572 2010.12.18 -9999.000
> > #29573 2010.12.19 -9999.000
> > #29574 2010.12.20 -9999.000
> > #29575 2010.12.21 -9999.000
> > #29576 2010.12.22 -9999.000
> > #29577 2010.12.23 -9999.000
> > #29578 2010.12.24 -9999.000
> > #29579 2010.12.25 -9999.000
> > #29580 2010.12.26 -9999.000
> > #29581 2010.12.27 -9999.000
> > #29582 2010.12.28 -9999.000
> > #29583 2010.12.29 -9999.000
> > #29584 2010.12.30 -9999.000
> > #29585 2010.12.31 -9999.000
> >
> > str(head(elsplit6,3))
> > #List of 3
> > # $ AGOMO:'data.frame': 29585 obs. of 2 variables:
> > #  ..$ date1    : chr [1:29585] "1930. 1. 1" "1930. 1. 2" "1930. 1. 3" "1930. 1. 4" ...
> > #  ..$ discharge: chr [1:29585] "-9999.000" "-9999.000" "-9999.000" "-9999.000" ...
> > # $ AGONO:'data.frame': 29585 obs. of 2 variables:
> > #  ..$ date1    : chr [1:29585] "1930. 1. 1" "1930. 1. 2" "1930. 1. 3" "1930. 1. 4" ...
> > #  ..$ discharge: chr [1:29585] "-9999.000" "-9999.000" "-9999.000" "-9999.000" ...
> > # $ ANZMA:'data.frame': 29585 obs. of 2 variables:
> > #  ..$ date1    : chr [1:29585] "1930. 1. 1" "1930. 1. 2" "1930. 1. 3" "1930. 1. 4" ...
> > #  ..$ discharge: chr [1:29585] "-9999.000" "-9999.000" "-9999.000" "-9999.000" ...
> >
> >
> > Regarding the space between date1 and discharge, I haven't checked it as you didn't mention whether it is needed in data.frame or not.
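
For reference, the fill-in step quoted above (joining each station against a full daily calendar and writing -9999.000 where no discharge exists) can also be sketched in base R with merge() instead of plyr::join(). The toy data frame below is a made-up stand-in for one station, the calendar is shortened to five days, and formatting the observed values with three decimals is an assumption rather than something requested in the thread.

```r
# Made-up stand-in for one station's observations (dates are illustrative only)
obs <- data.frame(
  date1     = as.Date(c("1930-01-02", "1930-01-04")),
  discharge = c(232, 334)
)

# Full daily calendar for the target window (shortened here to five days)
calendar <- data.frame(
  date1 = seq(as.Date("1930-01-01"), as.Date("1930-01-05"), by = "day")
)

# Left join: keep every calendar day; days without an observation get NA
filled <- merge(calendar, obs, by = "date1", all.x = TRUE)
filled <- filled[order(filled$date1), ]

# Format observed values first, then substitute the sentinel as text
# (three decimals for observed values is an assumption)
filled$discharge <- ifelse(is.na(filled$discharge),
                           "-9999.000",
                           sprintf("%.3f", filled$discharge))

print(filled)
```

Formatting the numbers before inserting the sentinel keeps the column's decimal places under explicit control, since assigning a string into a numeric column converts the remaining values with R's default number-to-text conversion.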
> >
> > A.K.
> >
> >
> >
> >
> >
> >
> > ________________________________
> > From: eliza botto <eliza_botto at hotmail.com>
> > To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com>
> > Sent: Tuesday, February 19, 2013 10:01 AM
> > Subject: RE:
> >
> >
> >
> > THANKS ARUN..
> > ITS A CHARACTER....
> > SORRY FOR NOT TELLING YOU IN ADVANCE
> >
> > ELISA
> >
> >
> > > Date: Tue, 19 Feb 2013 07:00:03 -0800
> > > From: smartpink111 at yahoo.com
> > > Subject: Re:
> > > To: eliza_botto at hotmail.com
> > >
> > >
> > >
> > > Hi,
> > > One more doubt.
> > > You mentioned -9999.000. Is it going to be a number or a character string like "-9999.000"? If it is a number, it will be printed as -9999, without the trailing zeros.
> > > Arun
> > >
> > >
> > >
> > >
> > > ________________________________
> > > From: eliza botto <eliza_botto at hotmail.com>
> > > To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com>
> > > Sent: Tuesday, February 19, 2013 9:16 AM
> > > Subject: RE:
> > >
> > >
> > >
> > > How can u be wrong arun?? you are right.....
> > >
> > > elisa
> > >
> > >
> > > > Date: Tue, 19 Feb 2013 06:15:31 -0800
> > > > From: smartpink111 at yahoo.com
> > > > Subject: Re:
> > > > To: eliza_botto at hotmail.com
> > > >
> > > > Hi Elisa,
> > > >
> > > > Just a doubt regarding the format of the date. Is it the same format as the previous one? 0 replaced by one space if either month or day is less than 10. Also, if I am correct, the list elements are for the different stationname, right?
> > > > Arun
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: eliza botto <eliza_botto at hotmail.com>
> > > > To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com>
> > > > Sent: Tuesday, February 19, 2013 8:35 AM
> > > > Subject:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Dear Arun,
> > > > [A text file is also attached in case the format is changed; el is the data file.]
> > > > Attached with this email is the Excel file which contains the data. The data is in the following form:
> > > >
> > > > col1.       col2. col3. col4. col5.
> > > > stationname year  month day   discharge
> > > > A           2004  1     1     232
> > > > A           2004  1     2     334
> > > > .............................
> > > > ........................
> > > > B 2009 11 323
> > > > B 2009 12332
> > > >
> > > >
> > > > The stations have data that start and end in different years, but I want every station's series to start in 1930 and end in 2010, with -9999.000 for the days when data is missing. I want to make a list which looks like the following:
> > > >
> > > > [[A]]
> > > > 1930. 1. 1 -9999.000
> > > > 1930. 1. 2 -9999.000
> > > > 1930. 1. 3 -9999.000
> > > > 1930. 1. 4 -9999.000
> > > > 1930. 1. 5 -9999.000
> > > > 1930. 1. 6 -9999.000
> > > > 1930. 1. 7 -9999.000
> > > > 1930. 1. 8 -9999.000
> > > > 1930. 1. 9 -9999.000
> > > > 1930. 1.10 -9999.000
> > > > 1930. 1.11 -9999.000
> > > > 1930. 1.12 -9999.000
> > > > 1930. 1.13 -9999.000
> > > > ....................
> > > > ....................
> > > > ....................
> > > > 2004. 1. 1 232.0
> > > > 2004. 1. 2 334.0
> > > > ..................
> > > > ..................
> > > > 2004.12. 1 113.56
> > > > ....
> > > > ...
> > > > 2004.12.31 114.56
> > > >
> > > > [[B]]
> > > > 1930. 1. 1 -9999.000
> > > > 1930. 1. 2 -9999.000
> > > > 1930. 1. 3 -9999.000
> > > > 1930. 1. 4 -9999.000
> > > > 1930. 1. 5 -9999.000
> > > > 1930. 1. 6 -9999.000
> > > > 1930. 1. 7 -9999.000
> > > > 1930. 1. 8 -9999.000
> > > > 1930. 1. 9 -9999.000
> > > > 1930. 1.10 -9999.000
> > > > 1930. 1.11 -9999.000
> > > > 1930. 1.12 -9999.000
> > > > 1930. 1.13 -9999.000
> > > > ....................
> > > > ....................
> > > > ....................
> > > > 2007. 1. 1 23.0
> > > > 2007. 1. 2 33.0
> > > > ..................
> > > > ..................
> > > > 2007.12. 1 13.56
> > > > ....
> > > > ...
> > > > 2007.12.31 4.56
> > > >
> > > >
> > > > In addition to padding the start and the end, there are stations like "BRRSD" where data exists only for the years 2001, 2002, 2009 and 2010. I want -9999.000 to be inserted for each day of 2003, 2004, 2005, 2006, 2007 and 2008, as data is not available for them.
> > > > The date format should be as written above. Just one request: please do not share my data file on the R forum.
> > > >
> > > > Thank you so very much in advance.
> > > >
> > > > elisa