[R] Recursive feature elimination keeping the weights constant
Priyanka Purkayastha
ppurk@y@@th@2010 @ending from gm@il@com
Tue Jan 8 19:54:10 CET 2019
Dear All,
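I am trying to build a model by recursively eliminating features one at a
time, each time dropping the feature with the smallest weight while keeping
the remaining weights constant.
This is the example matrix: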
ID 885038 885039 885040 885041 885042 885043 Label
weights 0.000236 0.004591 0.00017 0.018113 0.000238 0.006537 N/A
1267359 2 0 0 0 0 1 1
1295720 0 0 0 0 0 1 1
1295721 0 0 0 0 0 1 1
1295723 0 0 0 0 0 1 0
1295724 0 0 0 1 0 1 0
1296724 0 0 0 1 0 1 0
12957243 0 0 0 0 0 1 0
12957424 0 0 0 1 0 1 0
12967244 0 0 0 1 0 1 0
12673529 2 0 0 0 0 1 1
1295720 0 0 0 0 0 1 1
12957221 0 0 0 0 0 1 1
Below is the code I have written to eliminate the column with the minimum
weight, one at a time, and build an SVM model after each elimination.
library(e1071)
library(caret)
library(gplots)
library(ROCR)
data <- read.csv("data.csv", header = TRUE)
rownames(data) <- data[,1]
data<-data[,-1]
for (k in 1:ncol(data))
{
rowMin = which.min(data[1,])
data = data[-rowMin,]
data = data[-1,]
inTraining <- createDataPartition(data$Class, p = .70, list = FALSE)
training <- data[ inTraining,]
testing <- data[-inTraining,]
## Building the model ####
svm.model <- svm(Label ~ ., data = training,
cross=10,metric="ROC",type="eps-regression",kernel="linear",na.action=na.omit,probability
= TRUE)
#prediction and ROC
svm.model$index
svm.pred <- predict(svm.model, testing, probability = TRUE)
#calculating auc
c <- as.numeric(svm.pred)
c = c - 1
pred <- prediction(c, testing$Label)
perf <- performance(pred,"tpr","fpr")
plot(perf,fpr.stop=0.1)
auc <- performance(pred, measure = "auc")
auc <- auc@y.values[[1]]
print(paste(ncol(data),colnames(data)[rowMin],auc))
}
I want my output, like
number of columns, colname with minimum weight, AUC
5 , 885039, 0.67
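The elimination order itself depends only on the weights row, so it can be sketched separately from the model fitting. The following is a minimal sketch in base R, not the code from this post: the function name `eliminate_by_weight` and the objects `x` and `weights` are hypothetical, and the small matrix is typed in by hand from the first rows of the example. Note that it drops a column (feature) at each step, leaves the sample rows untouched, and never recomputes the weights.

```r
# Weights row of the example matrix, typed in by hand (assumption: these are fixed).
weights <- c("885038" = 0.000236, "885039" = 0.004591, "885040" = 0.00017,
             "885041" = 0.018113, "885042" = 0.000238, "885043" = 0.006537)

# First four sample rows of the example matrix, features only (no Label column).
x <- matrix(c(2, 0, 0, 0, 0, 1,
              0, 0, 0, 0, 0, 1,
              0, 0, 0, 0, 0, 1,
              0, 0, 0, 0, 0, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("1267359", "1295720", "1295721", "1295723"),
                            names(weights)))

# Hypothetical helper: repeatedly drop the feature with the smallest weight.
eliminate_by_weight <- function(x, weights) {
  n_steps <- length(weights) - 1L          # stop when one feature is left
  removed <- character(n_steps)
  n_left  <- integer(n_steps)
  for (i in seq_len(n_steps)) {
    drop_name <- names(which.min(weights)) # feature with the minimum weight
    x <- x[, colnames(x) != drop_name, drop = FALSE]  # remove a COLUMN, not a row
    weights <- weights[names(weights) != drop_name]   # other weights stay unchanged
    # A model would be fitted on the reduced x here and its AUC recorded.
    removed[i] <- drop_name
    n_left[i]  <- ncol(x)
  }
  data.frame(n_features = n_left, removed = removed, stringsAsFactors = FALSE)
}

steps <- eliminate_by_weight(x, weights)
print(steps)
```

With the weights shown in the example matrix, the first feature removed is 885040, since 0.00017 is the smallest weight.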
But I get the following error
Error in svm.default(x, y, scale = scale, ..., na.action = na.action) :
‘cross’ cannot exceed the number of observations!
In addition: Warning message:
In svm.default(x, y, scale = scale, ..., na.action = na.action) :
Variable(s) ‘X885039’ and ‘X885040’ and ‘X885042’ and ‘X885043’ constant.
Cannot scale data.
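Both messages can be read off the data: `cross=10` asks for 10-fold cross-validation, which needs at least 10 training rows, and the columns named in the warning hold a single value in every row, so they cannot be scaled. A minimal base R sketch of two guards follows, assuming a hypothetical training frame `training` typed in by hand from a few rows of the example; the names `is_constant`, `training_clean` and `folds` are illustrative only.

```r
# Hypothetical training frame: seven rows, two constant feature columns.
training <- data.frame(X885038 = c(2, 0, 0, 0, 0, 0, 0),
                       X885039 = 0,
                       X885041 = c(0, 0, 0, 0, 1, 1, 0),
                       X885043 = 1,
                       Label   = c(1, 1, 1, 0, 0, 0, 0))

# Flag feature columns that take a single value in every row.
features    <- setdiff(names(training), "Label")
is_constant <- vapply(training[features],
                      function(v) length(unique(v)) < 2,
                      logical(1))

# Keep only the non-constant features plus the label.
training_clean <- training[, c(features[!is_constant], "Label"), drop = FALSE]

# Never request more folds than there are training rows.
folds <- min(10, nrow(training_clean))

# These could then be passed on, for example:
# svm(Label ~ ., data = training_clean, cross = folds, kernel = "linear")
print(names(training_clean))
print(folds)
```

This only avoids the two messages; it does not address whether so few rows are enough to estimate an AUC reliably.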