[R] Data Frame housekeeping
Jonathan Daily
biomathjdaily at gmail.com
Tue May 24 21:24:26 CEST 2011
My suggestion, since bold doesn't show up on a text-only mailing list,
would be to look into the aggregate function (see ?aggregate).
It would look something like this (assuming your data is in mydat):
mydat.new <- aggregate(cbind(STN_ID, YEAR, MM, DAY) ~ ELEM + ?, mydat,
FUN = ?) #this is up to you
Alternatively, the plyr package is great at transforming data.frames.
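
A minimal sketch of one possible route to the layout requested in the quoted message, offered as an illustration rather than as the aggregate() call outlined above. It swaps in base R's reshape(), applied twice: once to stack the X* day columns into a DAY column, and once to spread ELEM into columns. The toy mydat below is an assumption: it copies only the two 1997/9 rows and the first three day columns from the post, and stores the values as character strings to keep the leading zeros. The names day_cols, long, wide and VALUE are made up for the example.

```r
# Toy subset of the posted data (assumption: values kept as character strings).
mydat <- data.frame(
  STN_ID = c(2402594, 2402594),
  YEAR   = c(1997, 1997),
  MM     = c(9, 9),
  ELEM   = c(1, 2),
  X1     = c("-00233", "-00339"),
  X2     = c("-00204", "-00339"),
  X3     = c("-00119", "-00343"),
  stringsAsFactors = FALSE
)

# The day columns are X1, X2, ... (31 of them in the real data).
day_cols <- grep("^X[0-9]+$", names(mydat), value = TRUE)

# Step 1: wide -> long. Each X* column becomes rows, with DAY taken from
# the number after the "X".
long <- reshape(
  mydat,
  direction = "long",
  varying   = day_cols,
  v.names   = "VALUE",
  timevar   = "DAY",
  times     = as.integer(sub("^X", "", day_cols)),
  idvar     = c("STN_ID", "YEAR", "MM", "ELEM")
)

# Step 2: long -> wide. One VALUE column per ELEM code.
wide <- reshape(
  long,
  direction = "wide",
  idvar     = c("STN_ID", "YEAR", "MM", "DAY"),
  timevar   = "ELEM",
  v.names   = "VALUE",
  sep       = ""
)

# Rename VALUE1, VALUE2, ... to ELEM1, ELEM2, ...
names(wide) <- sub("^VALUE", "ELEM", names(wide))
rownames(wide) <- NULL
print(wide)
```

On the real data, months with fewer than 31 days would presumably need their unused day rows dropped afterwards; how those are coded in the file is not stated in the post.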
On Tue, May 24, 2011 at 3:03 PM, Scott Hatcher
<scott.v.hatcher at gmail.com> wrote:
> Hello,
>
> I have a large data frame that is organized by date in a peculiar way. I
> am seeking advice on how to transform the data into a format that is of
> more use to me.
>
> The data is organized as follows:
>
>     STN_ID YEAR MM ELEM      X1       X2       X3      X4     X5     X6     X7
>  1 2402594 1997  9    1 *-00233* *-00204* *-00119* -00190 -00251 -00243 -00249
>  2 2402594 1997 10    1  -00003   -00005   -00001  -00039 -00031 -00036 -00033
>  3 2402594 1997 11    1  000025   000065   000070  000069 000115 000072 000093
>  4 2402594 1997 12    1  000160   000114   000143  000140 000093 000068 000157
>  5 2402594 1998  1    1  000067   000095   000139  000113 000066 000081 000070
>  6 2402594 1998  2    1  000098   000102   000140  000124 000082 000111 000047
>  7 2402594 1998  3    1  -00039   -00006   000015  000015 000016 000035 000013
>  8 2402594 1998  4    1  -00035   -00046   -00046  -00062 -00018 -00025 -00012
>  9 2402594 1998  5    1  000031   000011   -00005  -00061 -00061 -00080 -00217
> 10 2402594 1997  9    2 *-00339   -00339   -00343* -00346 -00285 -00253 -00253
> 11 2402594 1997 10    2  -00207   -00289   -00278  -00271 -00258 -00315 -00341
> 12 2402594 1997 11    2  -00242   -00230   -00206  -00180 -00256 -00227 -00241
> 13 2402594 1997 12    2  -00155   -00153   -00118  -00066 -00088 -00073 -00032
> 14 2402594 1998  1    2  000003   -00021   -00033  -00022 -00014 000001 000008
> 15 2402594 1998  2    2  000050   000077   000106  000073 000060 000060 000083
> 16 2402594 1998  3    2  000095   000063   000030  000057 000073 000144 000090
> 17 2402594 1998  4    2  000128   000178   000195  000157 000160 000160 000117
> 18 2402594 1998  5    2  000074   000064   000051  000027 000053 000063 000067
>
> Where "MM" is the month of the year, and ELEM is the variable that the
> values in the X* columns describe (in the actual data there are 31 X
> columns, one for each day of the month). The values in bold are the
> ones carried over into the small table below (which is the result I
> hope to get). This is to give a sense of how the data is picked out of
> the original data frame.
>
> I would like to organize the data so it looks like this:
>
>   STN_ID  YEAR MM DAY ELEM1  ELEM2
> 1 2402594 1997  9 X1  -00233 -00339
> 2 2402594 1997  9 X2  -00204 -00339
> 3 2402594 1997  9 X3  -00119 -00343
>
> That is, I create a new column named "DAY" that is made up of the
> numbers following "X" in the original data.frame column names. Also,
> the values for each ELEM become their own column, named with the ELEM
> code (in this case 1 and 2).
>
> I have tried to split apart the columns, transform them, and bind them
> back together, but my ability to do so just isn't there yet. I am still
> fairly new to R, and would really appreciate some help in working
> towards organizing this data frame.
>
> Thanks in advance,
> Scott Hatcher
>
--
===============================================
Jon Daily
Technician
===============================================
#!/usr/bin/env outside
# It's great, trust me.