[R] binning dates by decade for simulated data

Fri Mar 9 15:14:28 CET 2012

On Mar 8, 2012, at 7:37 PM, Jeff Garcia wrote:

> I have a simulated matrix of dates that I generated from a  
> probability function. Each column represents a single iteration.
>
> I would like to bin each run _separately_ by decade and dump the
> counts into a new matrix in which each column covers all decades for
> a single run and holds the number of dates binned in each decade.
>
> I have successfully done this for a single vector of dates, but not  
> for a matrix:
>
> "dates" is a vector of observed data representing when certain trees  
> established in a population
>
> #-----find min and max decade -----#
>
>        minDecade <- min(dates)
>        maxDecade <- max(dates)
>
> #-----create vector of decades -----#
>
>        allDecades <- seq(minDecade, 2001, by=10)
>
> #-----make empty vector of same length as decade vector-----#
>
>        bin.vec <- rep(0,length(allDecades))
>
> #-----populate bin.vec (empty vector) with the number of trees in each decade-----#
>
>        for (i in 1:length(allDecades)) {
>                bin.vec[i] <- length(which(dates == allDecades[i]))
>        }
>
>
> bin.vec:
>  [1]  0  0  0  0  0  0  0  0  0  0  1  1  1  0  1  2  0  1
> [19]  3  0  1  3  8  5  9  8  5  5  4 10  3  6  9 17 32 37
> [37] 35 25 31 41 41 44 45 40 50 43 59 42 46 28 16 18 20 16
> [55] 11  4  7  1
>
>
> My matrix looks like this (it actually has 835 rows; I used head()
> just to demonstrate).
>
> head(bin.mat)
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> [1,] 1831 1811 1841 1881 1851 1871 1921 1821 1781  1561
> [2,] 1851 1931 1821 1701 1841 1961 1941 1931 1891  1841
> [3,] 1751 1861 1861 1751 1841 1841 1771 1971 1811  1871
> [4,] 1831 1871 1741 1881 1871 1771 1821 1901 1901  1851
> [5,] 1681 1861 1871 1811 1711 1931 1891 1771 1811  1821
> [6,] 1931 1841 1841 1861 1831 1881 1601 1861 1891  1891

After setting up your allDecades vector to your liking, perhaps
something like:

apply( dates, 2, function(colm){
                      1 + max(findInterval(colm, allDecades)) -
                              min(findInterval(colm, allDecades) )
                                 } )

With that data (although changing its name to "years"):

 > years <- scan()
1:  1831 1811 1841 1881 1851 1871 1921 1821 1781  1561
11:  1851 1931 1821 1701 1841 1961 1941 1931 1891  1841
21:  1751 1861 1861 1751 1841 1841 1771 1971 1811  1871
31:  1831 1871 1741 1881 1871 1771 1821 1901 1901  1851
41:  1681 1861 1871 1811 1711 1931 1891 1771 1811  1821
51:  1931 1841 1841 1861 1831 1881 1601 1861 1891  1891
61:
Read 60 items
 > years <- matrix(years, nrow=6, byrow=TRUE)
 > minDecade <- min(years)
 > maxDecade <- max(years)
 > allDecades <- seq(minDecade, 2001, by=10)

 > apply( years, 2, function(colm){
+                      1 + max(findInterval(colm, allDecades)) -
+                              min(findInterval(colm, allDecades) )
+                                 } )
   [1] 26 13 14 19 17 20 35 21 13 34

You have not offered the correct answer you expect for your data, so
I leave it to you to decide whether the rules that 'findInterval' uses
for determining boundaries with your interval vector meet your
requirements.
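
To make those boundary rules concrete, here is a small sketch. It
assumes the same decade breakpoints as allDecades above; the names
'breaks', 'inside' and 'outside' are only illustrative. With its
default arguments, findInterval() treats each interval as closed on
the left and open on the right.

```r
# Decade breakpoints, built the same way as allDecades above
breaks <- seq(1561, 2001, by = 10)

# findInterval() returns i such that breaks[i] <= x < breaks[i + 1]
inside <- findInterval(c(1561, 1570, 1571), breaks)
inside   # 1 1 2 : 1570 is still in the first decade, 1571 starts the second

# Values below the first break map to 0; values at or above the last
# break map to length(breaks), which is 45 here
outside <- findInterval(c(1500, 2001, 2010), breaks)
outside  # 0 45 45
```

If you would rather have decades that run 1560-1569, 1570-1579 and so
on, shift the start of the breakpoint vector accordingly.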

>
> Each column is a separate run (runs <- 10 ) .  How can I bin each  
> column into decades separately?

That is not a good description of what I did, but following that
wording would have produced a result that only a list object could
hold, because of the irregular lengths. I decided from your sample
output that you just wanted a single number to describe the span of
years.
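
If per-decade counts for each run are what is wanted after all, one
possible sketch is below. It is an assumption about the goal, not
something confirmed in this thread. It counts every column against the
same allDecades vector, so each column comes out with the same length
and the result fits in a matrix. The name 'decade.counts' is
hypothetical, and only the six sample rows shown above are used.

```r
# The six sample rows from the question, one column per run
years <- matrix(c(
  1831, 1811, 1841, 1881, 1851, 1871, 1921, 1821, 1781, 1561,
  1851, 1931, 1821, 1701, 1841, 1961, 1941, 1931, 1891, 1841,
  1751, 1861, 1861, 1751, 1841, 1841, 1771, 1971, 1811, 1871,
  1831, 1871, 1741, 1881, 1871, 1771, 1821, 1901, 1901, 1851,
  1681, 1861, 1871, 1811, 1711, 1931, 1891, 1771, 1811, 1821,
  1931, 1841, 1841, 1861, 1831, 1881, 1601, 1861, 1891, 1891
), nrow = 6, byrow = TRUE)

allDecades <- seq(min(years), 2001, by = 10)

# For each column: map years to decade indices, then count per index.
# nbins pads with zeros so every column has length(allDecades) entries.
decade.counts <- apply(years, 2, function(colm) {
  tabulate(findInterval(colm, allDecades), nbins = length(allDecades))
})
rownames(decade.counts) <- allDecades

dim(decade.counts)      # 45 decades (rows) by 10 runs (columns)
colSums(decade.counts)  # each run accounts for all 6 of its dates
```

Each column can then be read the same way as the single-vector bin.vec
in the question.
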
>
> I'll bet this is super easy, but my R-skills are seriously limited!!!
>
> Thanks for any help!
> ~Jeff

David Winsemius, MD
West Hartford, CT