[R] deleting invariant rows and cols in a matrix
Patrick McKnight
pem at theriver.com
Sun May 12 00:49:50 CEST 2002
Greetings,
I couldn't find an existing function that scans a matrix and
eliminates invariant rows and columns, so I have started to write a
simple routine from scratch. The following code fails because the
array index goes out of bounds, for reasons that will be obvious
shortly.
Start with some data:
x <- read.table("myex.dat",header=T)
x
  v1 v2 v3 v4 v5 id
1  1  0  1  2  4  1
2  1  1  1  1  1  2
3  1  2  3  1  4  3
4  1  3  4  2  4  4
5  2  2  2  2  2  5
Here's my function:
---- begin R code ----
elimnovar <- function(x,first.item=NULL,nitems=NULL,responses=NULL){
  # Data prep - store as matrix, strip off id's, get variable names
  dat <- as.matrix(x)
  item.dat <- dat[,first.item:(first.item + nitems - 1)]
  inames <- dimnames(item.dat)[[2]]
  # Eliminate zero variance items and persons
  # Store data in temp name and keep original
  clean <- item.dat
  # Initialize the stop variable
  stp <- 0
  # Start cleanup process for both cols and rows
  while (stp != 1){
    stp.row <- rep(0,nrow(clean))
    stp.col <- rep(0,ncol(clean))
    # Start with rows
    for (i in 1:nrow(clean)){
      sdrow <- sd(clean[i,])
      if (sdrow==0) clean <- clean[i * -1,]
      if (sdrow==0) stp.row[i] <- 1
    }
    # Next check columns
    for (j in 1:ncol(clean)){
      sdcol <- sd(clean[,j])
      if (sdcol==0) clean <- clean[,j * -1]
      if (sdcol==0) stp.col[j] <- 1
    }
    # Do we need to continue with the process?
    if (sum(stp.row)==0 && sum(stp.col)==0) stp <- 1
  }
  # Output cleaned data to new dataset name
  cleaned <<- clean
}
---- end R code ----
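The failure can be reproduced in isolation. The following is only a
minimal sketch on a made-up 3 x 3 matrix (not the data above): the
sequence 1:nrow(m) is evaluated once, when the loop starts, so the
loop still visits index 3 after rows have been removed. Unlike the
function above, it uses drop = FALSE so that m stays a matrix, and
tryCatch so that the error message is captured rather than stopping
the script.

```r
# Made-up example matrix: rows 1 and 3 are invariant
m <- matrix(c(1, 1, 1,
              1, 2, 3,
              2, 2, 2), nrow = 3, byrow = TRUE)

res <- tryCatch({
  for (i in 1:nrow(m)) {   # 1:nrow(m) is fixed at 1:3 here
    # Deleting inside the loop shifts the remaining rows up,
    # so later values of i point at the wrong (or no) row
    if (sd(m[i, ]) == 0) m <- m[-i, , drop = FALSE]
  }
  "no error"
}, error = function(e) conditionMessage(e))

res   # holds the error message instead of "no error"
```

Note the second, quieter problem: once a row is deleted, the row that
slides into its place is never tested by the loop.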
So my questions are:
1. How can I create an array of row and column numbers to later
delete? I realize that the code above is running into problems because
the for loop is indexing non-existent rows/cols after they have been
deleted. The deletion process must occur after the loop. I know how to
easily drop a row or a column while in the for loop but storing those
rows and column numbers and then deleting them after the loop just
escapes me. Any suggestions?
2. Is there a more efficient way to complete this task? I don't
claim to be a programmer - a hack at best - but I can't imagine that
there is not a simpler method for achieving the goal of eliminating
invariant rows and columns.
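
For reference, one possible approach to both questions is sketched
below. It is only a sketch under stated assumptions: drop_invariant
is a hypothetical helper name, not an existing R function; the data
frame is typed in by hand to mirror the listing above, since myex.dat
is not available; and "invariant" is taken to mean a single distinct
value rather than sd == 0. Each pass first flags every invariant row
and column with apply(), and deletes them only after the scan, so no
index ever points at something already removed. Logical flags are
used instead of stored row numbers because m[-integer(0), ] selects
zero rows when nothing was flagged.

```r
# Hypothetical helper; name and interface are illustrative only
drop_invariant <- function(m) {
  m <- as.matrix(m)
  is.const <- function(v) length(unique(v)) == 1
  repeat {
    # Scan first: flag invariant rows and columns of the current matrix
    row.const <- apply(m, 1, is.const)
    col.const <- apply(m, 2, is.const)
    if (!any(row.const) && !any(col.const)) break
    # Delete after the scan; drop = FALSE keeps m a matrix
    m <- m[!row.const, !col.const, drop = FALSE]
    # Stop if nothing is left to scan
    if (nrow(m) == 0 || ncol(m) == 0) break
  }
  m
}

# Hand-typed copy of the example data (assumed to match myex.dat)
x <- data.frame(v1 = c(1, 1, 1, 1, 2),
                v2 = c(0, 1, 2, 3, 2),
                v3 = c(1, 1, 3, 4, 2),
                v4 = c(2, 1, 1, 2, 2),
                v5 = c(4, 1, 4, 4, 2),
                id = 1:5)

cleaned <- drop_invariant(x[, 1:5])   # items only, id column left out
cleaned
```

On this data the first pass removes rows 2 and 5, which makes v1 and
v5 invariant; the second pass removes those, leaving rows 1, 3, 4 and
columns v2, v3, v4. Returning the result, rather than assigning it
with <<-, lets the caller choose the name of the cleaned object.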
Thanks in advance for any and all suggestions.
--
Cheers,
Patrick