A support vector machine (SVM) is a supervised learning method that analyzes data and recognizes patterns, used for classification and regression analysis. The standard SVM takes a set of input data and predicts, for each given input, which of two possible classes the input is a member of, which makes the SVM a non-probabilistic binary linear classifier. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into the same space and predicted to belong to a category based on which side of the gap they fall on.
A Support Vector Machine (SVM) performs classification by constructing a hyperplane in N-dimensional space that optimally separates the data into two categories. SVM models are closely related to neural networks: in fact, an SVM model using a sigmoid kernel function is equivalent to a two-layer perceptron neural network.
The accuracy of an SVM model is largely dependent on the selection of the model parameters.
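In particular, the penalty parameter C and the kernel parameter gamma have to suit the data at hand. Rather than fixing them by hand as the code below does, OpenCV's 2.x interface can search for them by cross-validation with CvSVM::train_auto. Here is a minimal sketch, assuming an OpenCV 2.x build where train_auto and get_params are available; the function name tuneSvmParams is only illustrative:

#include <opencv2/ml/ml.hpp>

// Sketch: let OpenCV pick C and gamma by 10-fold cross-validation over its
// default parameter grids, instead of hard-coding them.
CvSVMParams tuneSvmParams(const cv::Mat& trainingData, const cv::Mat& trainingClasses)
{
    CvSVMParams params;
    params.svm_type    = CvSVM::C_SVC;
    params.kernel_type = CvSVM::RBF;
    params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 1000, 1e-6);

    CvSVM svm;
    // Trains repeatedly, keeping the parameter combination with the best
    // cross-validation accuracy.
    svm.train_auto(trainingData, trainingClasses, cv::Mat(), cv::Mat(), params, 10);
    return svm.get_params(); // the parameter set the cross-validation settled on
}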
In the parlance of the SVM literature, a predictor variable is called an attribute, and a transformed attribute that is used to define the hyperplane is called a feature. The task of choosing the most suitable representation is known as feature selection. A set of features that describes one case (i.e., a row of predictor values) is called a vector. The goal of SVM modeling is thus to find the optimal hyperplane that separates clusters of vectors in such a way that cases with one category of the target variable lie on one side of the plane and cases with the other category lie on the other side. The vectors nearest the hyperplane are the support vectors.
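In OpenCV terms, each such vector is one row of a single-channel float matrix, with one column per feature, plus a parallel one-column matrix of target categories. A tiny sketch of that layout, with made-up values purely for illustration:

#include <opencv2/core/core.hpp>

// Sketch: four cases, each described by a 2-feature vector (one row per case),
// and the matching binary target categories. All values are illustrative.
float features[4][2] = { {1.f, 1.f}, {2.f, 1.f}, {7.f, 8.f}, {8.f, 7.f} };
float targets[4]     = { -1.f, -1.f, 1.f, 1.f };

cv::Mat trainingData(4, 2, CV_32FC1, features);    // vectors of features
cv::Mat trainingClasses(4, 1, CV_32FC1, targets);  // one category per vector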
#include <opencv2/ml/ml.hpp>
#include <iostream>
#include <string>

// Helper functions from earlier posts in this series (assumed signatures;
// a sketch of evaluate is given after the class):
float evaluate(const cv::Mat& predicted, const cv::Mat& actual);
void plot_binary(cv::Mat& data, cv::Mat& classes, const std::string& name);

class atsSVM
{
public:
    void svm(cv::Mat& trainingData, cv::Mat& trainingClasses,
             cv::Mat& testData, cv::Mat& testClasses)
    {
        // SVM parameters
        CvSVMParams param = CvSVMParams();
        param.svm_type = CvSVM::C_SVC;
        param.kernel_type = CvSVM::RBF; // CvSVM::RBF, CvSVM::LINEAR ...
        param.degree = 0;               // for poly
        param.gamma = 20;               // for poly/rbf/sigmoid
        param.coef0 = 0;                // for poly/sigmoid
        param.C = 7;                    // for CV_SVM_C_SVC, CV_SVM_EPS_SVR and CV_SVM_NU_SVR
        param.nu = 0.0;                 // for CV_SVM_NU_SVC, CV_SVM_ONE_CLASS, and CV_SVM_NU_SVR
        param.p = 0.0;                  // for CV_SVM_EPS_SVR
        param.class_weights = NULL;     // for CV_SVM_C_SVC
        param.term_crit.type = CV_TERMCRIT_ITER | CV_TERMCRIT_EPS;
        param.term_crit.max_iter = 1000;
        param.term_crit.epsilon = 1e-6;

        // SVM training (for OpenCV >= 2.0, see the train_auto sketch above
        // to tune C and gamma automatically)
        CvSVM svm(trainingData, trainingClasses, cv::Mat(), cv::Mat(), param);

        // Classify every test row; store the results so showplot() can reuse them
        prediction = cv::Mat(testClasses.rows, 1, CV_32F);
        for (int i = 0; i < testData.rows; i++) {
            cv::Mat sample = testData.row(i);
            prediction.at<float>(i, 0) = svm.predict(sample);
        }

        float percentage = evaluate(prediction, testClasses) * 100;
        std::cout << "Support Vector Machine Evaluated Accuracy = "
                  << percentage << "%" << std::endl;
    }

    void showplot(cv::Mat testData)
    {
        plot_binary(testData, prediction, "Predictions SVM");
    }

private:
    cv::Mat prediction; // filled in by svm(), plotted by showplot()
};
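The class above leans on two helpers, evaluate and plot_binary, that this post does not define; they come from earlier posts in this series. As a point of reference, here is a minimal sketch of what evaluate could look like, under the assumption that the two classes are encoded as +1 and -1 (accuracy is then the fraction of test cases whose predicted label has the same sign as the true one); plot_binary would similarly draw the 2-D test points colored by their predicted class:

#include <opencv2/core/core.hpp>

// Minimal sketch of the assumed evaluate() helper: returns the fraction of
// rows where the predicted label has the same sign as the actual label.
// Assumes the two classes are encoded as +1 and -1.
float evaluate(const cv::Mat& predicted, const cv::Mat& actual)
{
    CV_Assert(predicted.rows == actual.rows);
    int correct = 0;
    for (int i = 0; i < predicted.rows; i++) {
        float p = predicted.at<float>(i, 0);
        float a = actual.at<float>(i, 0);
        if ((p >= 0 && a >= 0) || (p < 0 && a < 0))
            correct++;
    }
    return static_cast<float>(correct) / predicted.rows;
}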
The major strengths of SVM are that training is relatively easy and that there is no local optimum to get trapped in, unlike with neural networks. It also scales relatively well to high-dimensional data, and the trade-off between classifier complexity and error can be controlled explicitly. Its main weakness is the need for a good kernel function.
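That complexity/error trade-off is exactly what the C parameter in the code controls, and the kernel is picked through kernel_type. A hedged sketch of trying a plain linear kernel as a cheap baseline before reaching for RBF (the function name and the C value are illustrative only):

#include <opencv2/ml/ml.hpp>

// Sketch: a linear-kernel baseline. A smaller C tolerates more training
// error in exchange for a simpler decision boundary.
CvSVM* trainLinearBaseline(cv::Mat& trainingData, cv::Mat& trainingClasses)
{
    CvSVMParams linParam = CvSVMParams();
    linParam.svm_type    = CvSVM::C_SVC;
    linParam.kernel_type = CvSVM::LINEAR; // no gamma needed for a linear kernel
    linParam.C           = 1;             // illustrative; tune on your own data
    linParam.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 1000, 1e-6);
    return new CvSVM(trainingData, trainingClasses, cv::Mat(), cv::Mat(), linParam);
}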
thank you saharkiz

hi saharkiz, thank you for your sharing. I want to ask a question: how can I calculate the confidence of a predicted value? I guess you used a function named evaluate, is that true? thank you

hi saharkiz, thank you for this post. I am working on an English OCR app using OpenCV and C++. This post deals with two possible classes, but I have 52 classes (both small and capital letters). Can you tell me how to change this code into a multi-class SVM?