[R] Splitting a vector into equal groups

Berwin A Turlach berwin at maths.uwa.edu.au
Mon May 4 09:11:12 CEST 2009


G'day Utkarsh,

On Mon, 04 May 2009 11:51:21 +0530
utkarshsinghal <utkarsh.singhal at global-analytics.com> wrote:

> I have vector of length 52, say, x=sample(30,52,replace=T). I want to 
> sort x and split into five *nearly equal groups*.

What do you mean by *nearly equal groups*?  The size of the groups
should be nearly equal? The sum of the elements of the groups should be
nearly equal?

> Note that the observations are repeated in x so in case of a tie I
> want both the observations to fall in same group.

Then it becomes even more important to define what you mean by
"nearly equal groups".

As a start, you may consider:

R> set.seed(1)
R> x <- sample(30, 52, replace = TRUE)
R> xrle <- rle(sort(x))
R> xrle
Run Length Encoding
  lengths: int [1:25] 2 1 2 2 3 1 1 1 5 1 ...
  values : int [1:25] 1 2 4 6 7 8 9 11 12 13 ...
R> cumsum(xrle$lengths)
 [1]  2  3  5  7 10 11 12 13 18 19 24 25 26 28 29 32 35 38
[19] 43 45 46 48 49 51 52

and use this to determine your cut-offs.  E.g., should the first group
have 10, 11 or 12 elements in this case?  The information in xrle
should enable you to construct your five groups once you have decided
on a grouping.
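
If, for instance, "nearly equal" is taken to mean nearly equal group
sizes, one possible way to turn the run lengths into groups is sketched
below.  This is only an illustration under that assumption: the function
name split_ties and the rule "move each ideal boundary to the nearest end
of a run of ties" are my own choices, not something fixed by the problem.

```r
# Hypothetical helper: split sort(x) into (at most) k groups of roughly
# equal size while keeping tied values in the same group.
# Assumption: "nearly equal" refers to group sizes, not group sums.
split_ties <- function(x, k = 5) {
  xs   <- sort(x)
  n    <- length(xs)
  ends <- cumsum(rle(xs)$lengths)      # positions where a run of ties ends
  targets <- n * seq_len(k - 1) / k    # ideal (tie-ignoring) boundaries
  # Move each ideal boundary to the nearest feasible cut position.
  cuts <- vapply(targets,
                 function(t) ends[which.min(abs(ends - t))],
                 numeric(1))
  # Heavy ties can make boundaries coincide; in that case fewer than
  # k groups are returned.
  breaks <- unique(c(0, cuts, n))
  grp <- cut(seq_len(n), breaks = breaks, labels = FALSE)
  split(xs, grp)
}

# Usage: lengths(split_ties(x)) gives the resulting group sizes.
```

With the cumulative counts shown above, the ideal boundaries 10.4, 20.8,
31.2 and 41.6 would move to 10, 19, 32 and 43, i.e. group sizes of 10, 9,
13, 11 and 9.  A different definition of "nearly equal" (e.g. based on
sums) would need a different rule for choosing the cut-offs.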

HTH.

Cheers,

	Berwin

=========================== Full address =============================
Berwin A Turlach                            Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability        +65 6516 6650 (self)
Faculty of Science                          FAX : +65 6872 3919       
National University of Singapore     
6 Science Drive 2, Blk S16, Level 7          e-mail: statba at nus.edu.sg
Singapore 117546                    http://www.stat.nus.edu.sg/~statba



