[R] Recursive Feature Elimination with SVM

Wed Jan 2 08:13:36 CET 2019

This is the code I tried:

library(e1071)
library(caret)
library(ROCR)

data <- read.csv("data.csv", header = TRUE)
set.seed(998)

inTraining <- createDataPartition(data$Class, p = .70, list = FALSE)
training <- data[ inTraining,]
testing  <- data[-inTraining,]

while(length(data)>0){

## Building the model ####
svm.model <- svm(Class ~ ., data = training,
cross=10,metric="ROC",type="eps-regression",kernel="linear",na.action=na.omit,
probability = TRUE)
print(svm.model)

###### auc  measure #######

#prediction and ROC
svm.model$index
svm.pred <- predict(svm.model, testing, probability = TRUE)

#calculating auc
c <- as.numeric(svm.pred)
c = c - 1
pred <- prediction(c, testing$Class)
perf <- performance(pred,"tpr","fpr")
plot(perf,fpr.stop=0.1)
auc <- performance(pred, measure = "auc")
auc <- auc@y.values[[1]]
print(length(data))
print(auc)

#compute the weight vector
w = t(svm.model$coefs)%*%svm.model$SV

#compute ranking criteria
weight_matrix = w * w

#rank the features
w_transpose <- t(weight_matrix)
w2 <- as.matrix(w_transpose[order(w_transpose[,1], decreasing = FALSE),])
a <- as.matrix(w2[which(w2 == max(w2)),]) #to get the rows with minimum values
row.names(a) -> remove
training<- data[,setdiff(colnames(data),remove)]
}
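
For comparison, here is a minimal sketch of the ranking loop written as a function that is actually called, so that a result comes back. It rests on several assumptions: the helper name svm_rfe_rank and the synthetic data are hypothetical, there are exactly two classes, x is a numeric matrix of features only (no Class column), and one feature is dropped per pass. Note that in the loop above `data` never shrinks, so `while(length(data)>0)` cannot terminate; the sketch shrinks an index vector instead.

```r
library(e1071)

# Hypothetical helper (not part of e1071): SVM-RFE for a two-class problem.
# x: numeric matrix of features only; y: factor with two levels.
svm_rfe_rank <- function(x, y) {
  stopifnot(is.matrix(x), is.factor(y), nrow(x) == length(y))
  surviving <- seq_len(ncol(x))   # indexes of features still in play
  ranked <- integer(ncol(x))      # filled from the last position backwards
  pos <- ncol(x)
  while (length(surviving) > 0) {
    # drop = FALSE keeps x a matrix even when one column is left
    model <- svm(x[, surviving, drop = FALSE], y, type = "C-classification",
                 kernel = "linear", cost = 10, scale = FALSE)
    # weight vector of the linear SVM (1 x p for two classes)
    w <- t(model$coefs) %*% model$SV
    # the smallest squared weight marks the feature to eliminate
    worst <- which.min(w * w)
    ranked[pos] <- surviving[worst]
    pos <- pos - 1
    surviving <- surviving[-worst]
  }
  ranked  # last-surviving feature first, first-eliminated feature last
}

# Synthetic stand-in for the real data (an assumption for illustration).
# With a data frame from read.csv one would instead build, for example,
#   x <- as.matrix(data[, setdiff(names(data), "Class")])
#   y <- factor(data$Class)
set.seed(1)
n <- 60
y <- factor(rep(c("a", "b"), each = n / 2))
x <- matrix(rnorm(n * 5), nrow = n, dimnames = list(NULL, paste0("f", 1:5)))
x[, 2] <- x[, 2] + ifelse(y == "b", 3, 0)  # make f2 informative

ranking <- svm_rfe_rank(x, y)   # the function has to be called to get output
print(colnames(x)[ranking])
```

With 7000 columns, removing a single feature per pass means 7000 model fits, so dropping a batch of the lowest-ranked features in each pass may be more practical.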

On Wed, Jan 2, 2019 at 11:18 AM David Winsemius <dwinsemius using comcast.net>
wrote:

>
> On 1/1/19 5:31 PM, Priyanka Purkayastha wrote:
> > Thank you, David. I tried the same: I gave x as the data matrix and y
> > as the class label. But it returned an empty "featureRankedList". I
> > get no output when I try the code.
>
>
> If you want people to spend time on this you should post a reproducible
> example. See the Posting Guide ... and learn to post in plain text.
>
>
> --
>
> David
>
> >
> > On Tue, 1 Jan 2019 at 11:42 PM, David Winsemius
> > <dwinsemius using comcast.net <mailto:dwinsemius using comcast.net>> wrote:
> >
> >
> >     On 1/1/19 4:40 AM, Priyanka Purkayastha wrote:
> >     > I have a dataset (data) with 700 rows and 7000 columns. I am trying to do
> >     > recursive feature selection with the SVM model. A quick Google search
> >     > helped me find code for a recursive search with SVM. However, I am unable
> >     > to understand the first part of the code. How do I introduce my dataset in
> >     > the code?
> >
> >
> >     Generally the "labels" are given to such a machine learning device as the
> >     y argument, while the "features" are passed as a matrix to the x argument.
> >
> >
> >     --
> >
> >     David.
> >
> >     >
> >     > If the dataset is a matrix named data, please give me an example of
> >     > recursive feature selection with SVM. Below is the code I got for
> >     > recursive feature search.
> >     >
> >     >      svmrfeFeatureRanking = function(x,y){
> >     >
> >     >      #Checking for the variables
> >     >      stopifnot(!is.null(x) == TRUE, !is.null(y) == TRUE)
> >     >
> >     >      n = ncol(x)
> >     >      survivingFeaturesIndexes = seq_len(n)
> >     >      featureRankedList = vector(length=n)
> >     >      rankedFeatureIndex = n
> >     >
> >     >      while(length(survivingFeaturesIndexes)>0){
> >     >      #train the support vector machine
> >     >      svmModel = svm(x[, survivingFeaturesIndexes], y, cost = 10,
> >     >                  cachesize=500, scale=FALSE, type="C-classification",
> >     >                  kernel="linear" )
> >     >
> >     >      #compute the weight vector
> >     >      w = t(svmModel$coefs)%*%svmModel$SV
> >     >
> >     >      #compute ranking criteria
> >     >      rankingCriteria = w * w
> >     >
> >     >      #rank the features
> >     >      ranking = sort(rankingCriteria, index.return = TRUE)$ix
> >     >
> >     >      #update feature ranked list
> >     >      featureRankedList[rankedFeatureIndex] =
> >     > survivingFeaturesIndexes[ranking[1]]
> >     >      rankedFeatureIndex = rankedFeatureIndex - 1
> >     >
> >     >      #eliminate the feature with smallest ranking criterion
> >     >      (survivingFeaturesIndexes = survivingFeaturesIndexes[-ranking[1]])}
> >     >      return (featureRankedList)}
> >     >
> >     >
> >     >
> >     > I tried taking an idea from the above code and incorporating it in my
> >     > code, as shown below.
> >     >
> >     >      library(e1071)
> >     >      library(caret)
> >     >
> >     >      data<- read.csv("matrix.csv", header = TRUE)
> >     >
> >     >      x <- data
> >     >      y <- as.factor(data$Class)
> >     >
> >     >      svmrfeFeatureRanking = function(x,y){
> >     >
> >     >        #Checking for the variables
> >     >        stopifnot(!is.null(x) == TRUE, !is.null(y) == TRUE)
> >     >
> >     >        n = ncol(x)
> >     >        survivingFeaturesIndexes = seq_len(n)
> >     >        featureRankedList = vector(length=n)
> >     >        rankedFeatureIndex = n
> >     >
> >     >        while(length(survivingFeaturesIndexes)>0){
> >     >          #train the support vector machine
> >     >          svmModel = svm(x[, survivingFeaturesIndexes], y, cross=10,
> >     >                  cost = 10, type="C-classification", kernel="linear" )
> >     >
> >     >          #compute the weight vector
> >     >          w = t(svmModel$coefs)%*%svmModel$SV
> >     >
> >     >          #compute ranking criteria
> >     >          rankingCriteria = w * w
> >     >
> >     >          #rank the features
> >     >          ranking = sort(rankingCriteria, index.return = TRUE)$ix
> >     >
> >     >          #update feature ranked list
> >     >          featureRankedList[rankedFeatureIndex] =
> >     > survivingFeaturesIndexes[ranking[1]]
> >     >          rankedFeatureIndex = rankedFeatureIndex - 1
> >     >
> >     >          #eliminate the feature with smallest ranking criterion
> >     >          (survivingFeaturesIndexes = survivingFeaturesIndexes[-ranking[1]])}
> >     >
> >     >        return (featureRankedList)}
> >     >
> >     > But I couldn't get past the stage "update feature ranked list".
> >     > Please guide me.
> >     >
> >
> > --
> > Regards,
> >
> > Priyanka Purkayastha, M.Tech, Ph.D.,
> > SERB National Postdoctoral Researcher
> > Genomics and Systems Biology Lab,
> > Department of Chemical Engineering,
> > Indian Institute of Technology Bombay (IITB),
> > Powai, Mumbai- 400076
> >
> >
> >
>

