[R] Help : delete at random
Adaikalavan Ramasamy
ramasamy at cancer.org.uk
Tue Mar 1 16:54:21 CET 2005
This might be slightly more interesting. If we want values that are
missing completely at random, we can simply sample from all available
indices of a 2-d array.
# simulate data #
set.seed(1) # for reproducibility
m <- matrix( rnorm(12), nrow=4, ncol=3 )
m
           [,1]       [,2]       [,3]
[1,] -0.6264538  0.3295078  0.5757814
[2,]  0.1836433 -0.8204684 -0.3053884
[3,] -0.8356286  0.4874291  1.5117812
[4,]  1.5952808  0.7383247  0.3898432
# generate all possible (row, col) index pairs
indices <- expand.grid( row=1:nrow(m), col=1:ncol(m) )
N <- ncol(m)*nrow(m) # total number of elements
Now suppose you want 25% of the values to be missing. Then:
k <- round( 0.25 * N )
w <- as.matrix( indices[ sample( 1:N, k ), ] )
w # shows the row and column numbers that will be set to NA
  row col
4   4   1
5   1   2
1   1   1
m[ w ] <- NA # set the selected cells to NA
m
           [,1]       [,2]       [,3]
[1,]         NA         NA  0.5757814
[2,]  0.1836433 -0.8204684 -0.3053884
[3,] -0.8356286  0.4874291  1.5117812
[4,]         NA  0.7383247  0.3898432
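
Since the original question was about a data frame, here is a minimal
sketch of the same idea without building the index table. It assumes
numeric columns; the names m2, df and mask are made up for illustration
and are not from the code above. For a matrix, sampling linear indices
is enough; for a data frame, a logical matrix of the same shape can be
used as the index.

```r
set.seed(1)                                   # for reproducibility

# Matrix case: sample k linear indices directly
m2 <- matrix(rnorm(12), nrow = 4, ncol = 3)
k  <- round(0.25 * length(m2))                # number of cells to blank out
m2[sample(length(m2), k)] <- NA

# Data frame case: build a logical mask with exactly k TRUE cells
df <- data.frame(a = rnorm(4), b = rnorm(4), c = rnorm(4))
N  <- nrow(df) * ncol(df)                     # total number of cells
mask <- matrix(FALSE, nrow = nrow(df), ncol = ncol(df))
mask[sample(N, k)] <- TRUE
df[mask] <- NA                                # logical-matrix indexing

sum(is.na(m2))                                # number of missing cells in m2
sum(is.na(df))                                # number of missing cells in df
```

Both counts should equal k, because sample() draws without replacement
by default and so never picks the same cell twice.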
Regards, Adai
On Tue, 2005-03-01 at 15:30 +0100, Uwe Ligges wrote:
> Caroline TRUNTZER wrote:
> > Hello
> > I would like to delete some values at random in a data frame. Does
> > anyone know how I could do this?
>
> What about sample()-ing (if I understand "at random" correctly) a
> certain number of values from 1:nrow(data) and using the result as a
> negative index into the data.frame?
>
> Uwe Ligges
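
The suggestion above, read as deleting whole rows at random, might be
sketched as follows. The data frame dat and the count n_drop are
hypothetical, chosen only for illustration.

```r
set.seed(1)                                   # for reproducibility
dat    <- data.frame(id = 1:10, x = rnorm(10))  # hypothetical example data
n_drop <- 3                                   # hypothetical number of rows to delete

drop     <- sample(nrow(dat), n_drop)         # row numbers chosen at random
dat_kept <- dat[-drop, ]                      # negative index removes those rows
nrow(dat_kept)                                # rows remaining
```

One caveat: if n_drop is 0, drop is empty and dat[-drop, ] returns no
rows at all, so that case needs a guard.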
>
>
> > With best regards
> > Caroline
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html