[R] counting sets of consecutive integers in a vector
Mike Miller
mbmiller+l at gmail.com
Mon Jan 5 01:03:03 CET 2015
I have a vector of sorted positive integer values (e.g., positive integers
after applying sort() and unique()). For example, this:
c(1,2,5,6,7,8,25,30,31,32,33)
I want to make a matrix from that vector that has two columns: (1) the
first value in every run of consecutive integer values, and (2) the
corresponding number of consecutive values. For example:
c(1:20) would become this...
1 20
...because there are 20 consecutive integers beginning with 1, and
c(1,2,5,6,7,8,25,30,31,32,33) would become this:
1 2
5 4
25 1
30 4
What would be the best way to accomplish this? Here is my first effort:
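v <- c(1,2,5,6,7,8,25,30,31,32,33)
# v minus its index is constant within each run of consecutive integers
L <- rle( v - seq_along(v) )$lengths
n <- length( L )
# first value of each run, paired with the length of that run
matrix( c( v[ c( 1, cumsum(L)+1 ) ][1:n], L), nrow=n)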
     [,1] [,2]
[1,]    1    2
[2,]    5    4
[3,]   25    1
[4,]   30    4
I suppose that works well enough, but there may be a better way, and
besides, I wouldn't want to deny anyone here the opportunity to solve a
fun puzzle. ;-)
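One possible alternative, sketched under the same assumption of a sorted vector with no duplicates, marks the start of each run with diff() instead of rle(). The function name consecutive_runs and the column names are made up for illustration.

```r
# Hypothetical helper (not from the original post): summarise runs of
# consecutive integers in a sorted, duplicate-free vector.
consecutive_runs <- function(v) {
  if (length(v) == 0L) {
    # No values, so no runs: return an empty two-column matrix.
    return(matrix(numeric(0), ncol = 2,
                  dimnames = list(NULL, c("start", "length"))))
  }
  # A run starts at position 1 and wherever the gap to the previous value is not 1.
  starts <- c(TRUE, diff(v) != 1)
  # Run lengths are the distances between successive start positions.
  lens <- diff(c(which(starts), length(v) + 1L))
  cbind(start = v[starts], length = lens)
}

runs <- consecutive_runs(c(1, 2, 5, 6, 7, 8, 25, 30, 31, 32, 33))
```

Here runs has one row per run, with the first value in the "start" column and the count in the "length" column.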
The use for this is that I will be doing repeated seeks of a binary file
to extract data. seek() sets the starting point and readBin(n=X) sets
the number of values to read. So when there are many consecutive variables
to be read, I can multiply the X in n=X by that number instead of doing
many different seek() calls. (The data are in a transposed format where I
read in every record for some variable as sequential elements.) I'm
probably not the first person to deal with this.
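The seek-and-read idea could look roughly like the sketch below. The file layout is an assumption made for illustration, not something stated above: every variable is stored as n_rec consecutive 8-byte doubles, and variables are numbered from 1. The names path, n_rec, n_var and size are hypothetical, and the demonstration file is generated on the spot. Note that readBin()'s n counts values of the given type, so it equals a byte count only when each value is one byte.

```r
# Assumed layout (hypothetical): variable j occupies n_rec consecutive
# 8-byte doubles, starting at byte (j - 1) * n_rec * size.
n_rec <- 3L   # records per variable (assumption)
n_var <- 10L  # variables in the file (assumption)
size  <- 8L   # bytes per stored value (assumption)

# Build a small demonstration file: variable j holds j + 0.1, j + 0.2, j + 0.3.
path <- tempfile(fileext = ".bin")
demo <- as.vector(outer(seq_len(n_rec) / 10, seq_len(n_var), "+"))
writeBin(demo, path, size = size)

# Variables to extract, and their runs of consecutive values.
v <- c(1, 2, 5, 6, 7, 8)
L <- rle(v - seq_along(v))$lengths   # run lengths
starts <- v[cumsum(L) - L + 1]       # first variable in each run

con <- file(path, "rb")
chunks <- lapply(seq_along(starts), function(i) {
  # One seek per run, to the byte offset of the run's first variable.
  seek(con, where = (starts[i] - 1) * n_rec * size, origin = "start")
  # One read per run, covering every variable in that run.
  readBin(con, what = "double", n = L[i] * n_rec, size = size)
})
close(con)

# One column per requested variable, in the order given by v.
values <- matrix(unlist(chunks), nrow = n_rec)
```

With these inputs the six requested variables are read with two seek() calls instead of six.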
Best,
Mike
--
Michael B. Miller, Ph.D.
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4AAAAJ