[R] Subsampling-oversampling from a data frame

B77S bps0002 at auburn.edu
Wed Nov 2 00:06:48 CET 2011


If no one has a better solution: split the data frame by class, take a sample
of size X from each piece, and put the pieces back together.
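
A minimal sketch of that split / sample / recombine idea in base R follows. The
data frame df is made up here to mimic the 1200-row, 0.3/0.7 example from the
question, and choosing X as the size of the smaller class is an assumption, not
something stated in the thread.

```r
# Hypothetical data resembling the question: 1200 rows, 30% low, 70% high
set.seed(1)
n <- 1200
df <- data.frame(
  age   = sample(10:20, n, replace = TRUE),
  sex   = sample(c("m", "f"), n, replace = TRUE),
  class = rep(c("low", "high"), times = c(360, 840)),
  stringsAsFactors = FALSE
)

# Split into one data frame per class
parts <- split(df, df$class)

# Assumption: X is the size of the smallest class, so 'high' gets subsampled
x <- min(sapply(parts, nrow))

# Draw x rows without replacement from each piece
sampled <- lapply(parts, function(d) d[sample(nrow(d), x), , drop = FALSE])

# Put the pieces back together and reset the row names
balanced <- do.call(rbind, sampled)
rownames(balanced) <- NULL

# Class counts in the new data frame
table(balanced$class)
```

To oversample the 'low' class instead, the same pattern should work with a
larger x and replace = TRUE in the call to sample().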


hgwelec wrote:
> 
> Dear members,
> 
> Consider the following data frame (first 4 rows shown)
> 
> 
>   age sex class
>   15   m   low
>   20   f  high
>   15   f   low
>   10   m   low
> 
> In my original data set I have 1200 rows and a class distribution of
> low=0.3 and high=0.7.
> 
> 
> My question: how can I create a new data frame like the one shown above, but
> with the 'high' class subsampled so that in the new data frame the class
> distribution is low=0.5 and high=0.5?
> 
> I tried looking at the sample function and its prob option, but none of the
> examples I have seen deal with an imbalanced class problem like the one
> shown above.
> 
> 
> Thank you in advance
> 


--
View this message in context: http://r.789695.n4.nabble.com/Subsampling-oversampling-from-a-data-frame-tp3965771p3965827.html
Sent from the R help mailing list archive at Nabble.com.