[R] Binning Question
David Winsemius
dwinsemius at comcast.net
Tue Apr 13 06:06:21 CEST 2010
On Apr 12, 2010, at 9:07 PM, Noah Silverman wrote:
> Hi,
>
> I'm trying to set up some complicated binning with statistics and could
> use a little help.
>
> I've found the bin2 function from the ash package, but it doesn't do
> everything I need. My intention is to copy some of their code and then
> modify as needed.
>
> I have a matrix with two columns:
>
> head(data)
>               r1          r2
> [1,]  0.03516559  0.03102128
> [2,]  0.02162539  0.14847034
> [3,]  0.02210339  0.06539623
> [4,] -0.07547792 -0.08859678
> [5,]  0.03655620  0.05412436
> [6,]  0.06513828  0.06053050
>
>
> I'd like to create a two-dimensional list of bins with the frequency counts
> for each bin. The bin2 function does this. Then it gets interesting.
>
> I'd like to add a column to my matrix that has the "bin label" for the
> bin that row would belong to. (I can see how to do this with lots of
> nasty loops and greater-than, less-than calculations, but that gets
> messy.) There must be an easier way.
Let's say you used the example in bin2:
library(ash)  # provides bin2()
dat <- as.data.frame(matrix(rnorm(200), 100, 2))  # bivariate normal, n=100
ab <- matrix(c(-5, -5, 5, 5), 2, 2)  # interval [-5,5) x [-5,5)
nbin <- c(20, 20)  # 400 bins
bins <- bin2(as.matrix(dat), ab, nbin)  # bin counts, ab, nskip (as.matrix in case bin2 needs a matrix)
# Row i of ab and nbin[i] describe column i.
# right=FALSE makes cut() use [a,b) intervals, to match the intervals noted above.
dat$r1.cat <- cut(dat[,1], right=FALSE,
                  breaks=seq(ab[1,1], ab[1,2], length.out=nbin[1]+1))
dat$r2.cat <- cut(dat[,2], right=FALSE,
                  breaks=seq(ab[2,1], ab[2,2], length.out=nbin[2]+1))
dat$bicat <- with(dat, paste(as.numeric(r1.cat), as.numeric(r2.cat), sep="."))
Or leave off the as.numeric if you want the labels to be more "cut"-like.
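If a single integer label per row is more convenient than a pasted string, something along these lines might work. It is only a sketch in base R: it uses findInterval() in place of cut(), and the seed and the names brk1, brk2, i1, i2, bin_id and counts are my own assumptions, not anything from ash.

```r
# Sketch: label each row with its 2-D bin using base R only.
set.seed(1)                                    # assumed seed, for reproducibility
dat  <- as.data.frame(matrix(rnorm(200), 100, 2))
ab   <- matrix(c(-5, -5, 5, 5), 2, 2)          # row i = c(lower, upper) for column i
nbin <- c(20, 20)

# Equally spaced break points for each dimension
brk1 <- seq(ab[1, 1], ab[1, 2], length.out = nbin[1] + 1)
brk2 <- seq(ab[2, 1], ab[2, 2], length.out = nbin[2] + 1)

# findInterval() gives the index of the left-closed interval [b_i, b_(i+1))
i1 <- findInterval(dat[, 1], brk1)
i2 <- findInterval(dat[, 2], brk2)

# Points outside ab come back as 0 or nbin + 1; mark them as NA
i1[i1 < 1 | i1 > nbin[1]] <- NA
i2[i2 < 1 | i2 > nbin[2]] <- NA

# One integer label per row, numbered 1 .. nbin[1] * nbin[2]
dat$bin_id <- (i2 - 1) * nbin[1] + i1

# Frequency counts laid out as an nbin[1] x nbin[2] table
counts <- table(factor(i1, levels = seq_len(nbin[1])),
                factor(i2, levels = seq_len(nbin[2])))
```

The counts table can then be compared with the counts that bin2 reports, and bin_id can be used directly as a grouping variable.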
>
> So, if I made 10 bins for each column (r1, r2), I'd have 100 bins.
> (bin1, bin2, bin3, etc.) I want to label each ROW in my data set with
> the bin it would belong to. (I intend to do more work with them after
> this, but this is a start. Each row gets transformed depending on the bin
> it belongs to, etc.)
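Once every row carries a bin label, a per-bin transformation is usually a one-liner with ave(). A rough sketch, assuming 10 bins per column over [-5,5) and using a made-up transformation (subtracting the bin mean of r1); the seed and the names brk, c1, c2, bin, r1_centered and bin_sizes are mine:

```r
# Sketch: apply a per-bin transformation after labelling each row.
set.seed(2)                                    # assumed seed, for reproducibility
dat <- data.frame(r1 = rnorm(100), r2 = rnorm(100))
brk <- seq(-5, 5, length.out = 11)             # 10 bins per column, 100 bins in total

# Left-closed intervals; labels = FALSE returns integer codes 1..10
c1 <- cut(dat$r1, breaks = brk, right = FALSE, labels = FALSE)
c2 <- cut(dat$r2, breaks = brk, right = FALSE, labels = FALSE)
dat$bin <- paste(c1, c2, sep = ".")

# Hypothetical transformation: subtract the bin mean of r1 from each row's r1
dat$r1_centered <- dat$r1 - ave(dat$r1, dat$bin)

# Number of rows that landed in each occupied bin
bin_sizes <- table(dat$bin)
```

ave() returns a vector the same length as its input, so the result lines up row by row with the original data. Pass a different function through its FUN argument for other transformations.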
>
> Thanks,
>
> -N