[R] data.frame: adding a column that is based on ranges of values in another column
jim holtman
jholtman at gmail.com
Mon Jul 5 22:55:03 CEST 2010
use 'merge':
> DF = data.frame(X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828, 114.8903, 114.9519, 114.8842,
+ 114.8579, 114.8489), Y = c(47.14094, 46.98874, 46.91235, 46.88265,
46.80584, 46.67022, 46.53264, 46.47727,
+ 46.46457, 46.47032), Date = as.Date(c('2009-01-01', '2009-01-03',
'2009-01-05', '2009-01-10', '2009-01-14',
+ '2009-01-15', '2009-01-16', '2009-01-17', '2009-01-22', '2009-01-29')))
>
> s.d1 = '2009-01-01'
> e.d1 = '2009-01-14'
> f.n1 = seq(from = as.Date(s.d1) , to = as.Date(e.d1), by = 1)
>
> s.d2 = '2009-01-15'
> e.d2 = '2009-01-31'
> f.n2 = seq(from = as.Date(s.d2) , to = as.Date(e.d2), by = 1)
> x.new <- data.frame(Date=c(f.n1, f.n2),
+ Fortnight=c(rep("FN1", length(f.n1)), rep("FN2", length(f.n2))))
>
> merge(DF, x.new, all.x=TRUE)
Date X Y Fortnight
1 2009-01-01 114.5508 47.14094 FN1
2 2009-01-03 114.6468 46.98874 FN1
3 2009-01-05 114.6596 46.91235 FN1
4 2009-01-10 114.6957 46.88265 FN1
5 2009-01-14 114.6828 46.80584 FN1
6 2009-01-15 114.8903 46.67022 FN2
7 2009-01-16 114.9519 46.53264 FN2
8 2009-01-17 114.8842 46.47727 FN2
9 2009-01-22 114.8579 46.46457 FN2
10 2009-01-29 114.8489 46.47032 FN2
On Mon, Jul 5, 2010 at 4:01 PM, Abdi, Abdulhakim <AbdiA at si.edu> wrote:
> Dear List,
>
> I've been looking tirelessly for a solution to this dilemma but without success. Perhaps someone has an idea that will guide me in the right direction.
>
> Suppose I have the following data.frame:
>
> DF = data.frame(X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828, 114.8903, 114.9519, 114.8842,
> 114.8579, 114.8489), Y = c(47.14094, 46.98874, 46.91235, 46.88265, 46.80584, 46.67022, 46.53264, 46.47727,
> 46.46457, 46.47032), Date = as.Date(c('2009-01-01', '2009-01-03', '2009-01-05', '2009-01-10', '2009-01-14',
> '2009-01-15', '2009-01-16', '2009-01-17', '2009-01-22', '2009-01-29')))
>
> DF
> X Y Date
> 1 114.5508 47.14094 2009-01-01
> 2 114.6468 46.98874 2009-01-03
> 3 114.6596 46.91235 2009-01-05
> 4 114.6957 46.88265 2009-01-10
> 5 114.6828 46.80584 2009-01-14
> 6 114.8903 46.67022 2009-01-15
> 7 114.9519 46.53264 2009-01-16
> 8 114.8842 46.47727 2009-01-17
> 9 114.8579 46.46457 2009-01-22
> 10 114.8489 46.47032 2009-01-29
>
> I also have two objects that contain the dates of the first and last fortnight of the month of January 2009.
>
> s.d1 = '2009-01-01'
> e.d1 = '2009-01-14'
> f.n1 = seq(from = as.Date(s.d1) , to = as.Date(e.d1), by = 1)
>
> f.n1
> [1] "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" "2009-01-05" "2009-01-06" "2009-01-07" "2009-01-08" "2009-01-09" "2009-01-10" "2009-01-11" "2009-01-12" "2009-01-13" "2009-01-14"
>
> s.d2 = '2009-01-15'
> e.d2 = '2009-01-31'
> f.n2 = seq(from = as.Date(s.d2) , to = as.Date(e.d2), by = 1)
>
> f.n2
> [1] "2009-01-15" "2009-01-16" "2009-01-17" "2009-01-18" "2009-01-19" "2009-01-20" "2009-01-21" "2009-01-22" "2009-01-23" "2009-01-24" "2009-01-25" "2009-01-26" "2009-01-27" "2009-01-28" "2009-01-29" "2009-01-30" "2009-01-31"
>
>
> I'm trying to add a column called "Fortnight" to the existing data.frame. The components of the new "Fortnight" column are based on the existing "Date" column so that if the value in "Date" falls within the first fortnight (f.n1) then the value of the new "Fortnight" column would be "FN1", and if the value of the "Date" column falls within the second fortnight (f.n2), then the value of the "Fortnight" column would be "FN2", and so on.
>
> The end result should look like:
>
> X Y Date Fortnight
> 1 114.5508 47.14094 2009-01-01 FN1
> 2 114.6468 46.98874 2009-01-03 FN1
> 3 114.6596 46.91235 2009-01-05 FN1
> 4 114.6957 46.88265 2009-01-10 FN1
> 5 114.6828 46.80584 2009-01-14 FN1
> 6 114.8903 46.67022 2009-01-15 FN2
> 7 114.9519 46.53264 2009-01-16 FN2
> 8 114.8842 46.47727 2009-01-17 FN2
> 9 114.8579 46.46457 2009-01-22 FN2
> 10 114.8489 46.47032 2009-01-29 FN2
>
> I manually entered the above values for the "Fortnight" column to illustrate my point, however, that would be quite tiresome for 500+ rows of data ;-)
>
> The only other similar issue I found on the list was https://stat.ethz.ch/pipermail/r-help/2008-February/153995.html but that particular problem is slightly different than what I'm trying to accomplish here.
>
> I appreciate your time and assistance.
>
> Thanks in advance.
>
> Regards,
>
>
> Hakim Abdi
>
>
>
> _________________________________
> Abdulhakim Abdi, M.Sc.
> Research Intern
>
> Conservation GIS/Remote Sensing Lab
> Smithsonian Conservation Biology Institute
> 1500 Remount Road
> Front Royal, VA 22630
> phone: +1 540 635 6578
> mobile: +1 747 224 7006
> fax: +1 540 635 6506 (Attn:GIS Lab)
> email: abdia at si.edu
> http://nationalzoo.si.edu/SCBI/ConservationGIS/
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
More information about the R-help
mailing list