[R] SVM accuracy question

Tue Sep 27 00:32:25 CEST 2011

Hi, I'm using a support vector machine for classification, and I have 
a question about the accuracy of its predictions.

I split my data set into a training set (2/3 of the entire data set) and 
a test set (1/3 of the data set) using the "sample" function. Each time 
I fit the svm model I obtain a different result, depending on the 
outcome of the "sample" function. I would like to "stabilize" the 
performance of my analysis. To do this I used the "set.seed" function. 
Is there a better way to do this? Should I bootstrap my whole workflow 
(sample and svm)?

Here is an example of my workflow:
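### not to run
library(e1071)  # provides svm() and tune.svm()

index <- 1:nrow(myData)
set.seed(23)
# hold out a random third of the rows as the test set
testindex <- sample(index, trunc(length(index)/3))
testset <- myData[testindex, ]
trainset <- myData[-testindex, ]

# tune on the training set only; the grids below are placeholder values
tuned <- tune.svm(Factor ~ ., data = trainset,
                  cost = 2^(0:4), gamma = 2^(-4:0))
svm.model <- svm(Factor ~ ., data = trainset,
                 cost = tuned$best.parameters$cost,
                 gamma = tuned$best.parameters$gamma, cross = 10)
summary(svm.model)
# predict on the held-out rows and compare with the true classes
pred <- predict(svm.model, testset)
table(predicted = pred, observed = testset$Factor)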
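
To make the bootstrap idea concrete, the repeated version of the workflow 
would look roughly like the sketch below. It is only an illustration: it 
uses the built-in iris data as a stand-in for myData, keeps the default 
svm parameters instead of tuned ones, and split_accuracy is a helper 
name made up for this example.

```r
library(e1071)  # assumed to be installed; provides svm()

# Hypothetical helper: draw one random 2/3 train, 1/3 test split,
# fit an svm with default parameters, and return the test accuracy.
split_accuracy <- function(data) {
  n <- nrow(data)
  testindex <- sample(n, trunc(n / 3))
  trainset <- data[-testindex, ]
  testset <- data[testindex, ]
  model <- svm(Species ~ ., data = trainset)
  mean(predict(model, testset) == testset$Species)
}

set.seed(23)  # makes the whole sequence of splits reproducible
acc <- replicate(50, split_accuracy(iris))

mean(acc)  # average test accuracy over the 50 random splits
sd(acc)    # variation that comes from the random partition alone
```

Here set.seed only makes the run reproducible; the mean and standard 
deviation over the splits are what would describe how stable the 
accuracy is.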

Best
Riccardo