Tuesday, July 19, 2011

Artificial Intelligence (K Nearest Neighbor) in OpenCV

In pattern recognition, the k-nearest neighbor algorithm (k-NN) is a method for classifying objects based on closest training examples in the feature space. k-NN is a type of instance-based learning, or lazy learning where the function is only approximated locally and all computation is deferred until classification. The k-nearest neighbor algorithm is amongst the simplest of all machine learning algorithms: an object is classified by a majority vote of its neighbors, with the object being assigned to the class most common amongst its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of its nearest neighbor.
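To make the majority vote concrete, here is a minimal sketch in plain C++ without OpenCV. The names Sample, squaredDistance and classifyKnn are made up for this illustration, and breaking ties in favor of the smaller label is an assumption of the sketch, not part of the algorithm's definition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One labelled training example (hypothetical type for this sketch).
struct Sample {
    std::vector<float> features;
    int label;
};

// Squared Euclidean distance. The square root is skipped because it
// does not change which neighbors are nearest.
float squaredDistance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Classify `query` by majority vote among its k nearest training samples.
int classifyKnn(const std::vector<Sample>& training,
                const std::vector<float>& query, std::size_t k) {
    assert(k > 0 && k <= training.size());

    // Pair every training sample's distance to the query with its label.
    std::vector<std::pair<float, int>> byDistance;
    for (const Sample& s : training) {
        byDistance.emplace_back(squaredDistance(s.features, query), s.label);
    }

    // Bring the k smallest distances to the front.
    std::partial_sort(byDistance.begin(), byDistance.begin() + k, byDistance.end());

    // Count the votes of the k nearest neighbors.
    std::map<int, int> votes;
    for (std::size_t i = 0; i < k; ++i) {
        ++votes[byDistance[i].second];
    }

    // Pick the label with the most votes (assumption: ties go to the smaller label).
    int best = votes.begin()->first;
    int bestCount = 0;
    for (const auto& v : votes) {
        if (v.second > bestCount) {
            best = v.first;
            bestCount = v.second;
        }
    }
    return best;
}
```

Note that there is no training step at all: the "model" is just the stored samples, and all the work happens inside classifyKnn at query time. With k = 1 the function simply returns the label of the single closest sample.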

The k-NN algorithm can also be adapted for use in estimating continuous variables. One such implementation uses an inverse distance weighted average of the k-nearest multivariate neighbors. This algorithm functions as follows:
  1. Compute the Euclidean or Mahalanobis distance from the target point to each of the sampled points.
  2. Order the samples by the calculated distances.
  3. Choose a heuristically optimal number of neighbors k based on the RMSE obtained by cross-validation.
  4. Calculate an inverse distance weighted average over the k nearest multivariate neighbors.
Using a weighted k-NN can also significantly improve the results: the class (or value, in regression problems) of each of the k nearest points is multiplied by a weight proportional to the inverse of the distance between that point and the point for which the class is to be predicted.
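The four steps above can be sketched in plain C++ as follows. This is only an illustration under some assumptions: it uses Euclidean distance (not Mahalanobis), leave-one-out as the cross-validation scheme, and returns the stored value directly when the query coincides with a sample to avoid dividing by zero. Point, euclidean, predictIdw and chooseK are hypothetical names, not OpenCV functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// One sampled point: a feature vector and the continuous value observed there.
struct Point {
    std::vector<double> features;
    double value;
};

// Step 1: Euclidean distance between two feature vectors.
double euclidean(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += (a[i] - b[i]) * (a[i] - b[i]);
    }
    return std::sqrt(sum);
}

// Steps 1, 2 and 4: inverse distance weighted average of the k nearest samples.
// `skip` is the index of a sample to leave out (used for cross-validation);
// pass samples.size() to use every sample.
double predictIdw(const std::vector<Point>& samples,
                  const std::vector<double>& query,
                  std::size_t k, std::size_t skip) {
    std::vector<std::pair<double, double>> byDistance;  // (distance, value)
    for (std::size_t i = 0; i < samples.size(); ++i) {
        if (i != skip) {
            byDistance.emplace_back(euclidean(samples[i].features, query),
                                    samples[i].value);
        }
    }
    assert(k > 0 && k <= byDistance.size());

    // Step 2: order the samples so that the k nearest come first.
    std::partial_sort(byDistance.begin(), byDistance.begin() + k, byDistance.end());

    // Step 4: weight each neighbor's value by 1 / distance.
    double weightedSum = 0.0;
    double weightTotal = 0.0;
    for (std::size_t i = 0; i < k; ++i) {
        if (byDistance[i].first == 0.0) {
            return byDistance[i].second;  // exact match, avoid division by zero
        }
        const double w = 1.0 / byDistance[i].first;
        weightedSum += w * byDistance[i].second;
        weightTotal += w;
    }
    return weightedSum / weightTotal;
}

// Step 3: pick the k in [1, maxK] with the lowest leave-one-out RMSE.
std::size_t chooseK(const std::vector<Point>& samples, std::size_t maxK) {
    assert(maxK >= 1 && maxK < samples.size());
    std::size_t bestK = 1;
    double bestRmse = -1.0;
    for (std::size_t k = 1; k <= maxK; ++k) {
        double squaredError = 0.0;
        for (std::size_t i = 0; i < samples.size(); ++i) {
            const double err =
                predictIdw(samples, samples[i].features, k, i) - samples[i].value;
            squaredError += err * err;
        }
        const double rmse = std::sqrt(squaredError / samples.size());
        if (bestRmse < 0.0 || rmse < bestRmse) {
            bestRmse = rmse;
            bestK = k;
        }
    }
    return bestK;
}
```

A typical use would be to call chooseK once on the sampled data and then call predictIdw(samples, query, k, samples.size()) for each new query. The same 1 / distance weights can be applied to class votes instead of values to get the weighted classifier described above.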

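#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

// evaluate() and plot_binary() are helper functions assumed to be defined elsewhere (not shown in this post).
class atsKNN{
public :
    // Train on the training set, classify every test sample and print the accuracy.
    void knn(cv::Mat& trainingData, cv::Mat& trainingClasses, cv::Mat& testData, cv::Mat& testClasses, int K)
    {
        cv::KNearest knn(trainingData, trainingClasses, cv::Mat(), false, K);
        cv::Mat predicted(testClasses.rows, 1, CV_32F);
        for(int i = 0; i < testData.rows; i++) {
            const cv::Mat sample = testData.row(i);
            predicted.at<float>(i,0) = knn.find_nearest(sample, K);
        }

        float percentage = evaluate(predicted, testClasses) * 100;
        std::cout << "K Nearest Neighbor Evaluated Accuracy = " << percentage << "%" << std::endl;
        prediction = predicted;
    }
    // Plot the test points using the predictions stored by the last knn() call.
    void showplot(cv::Mat testData)
    {
        plot_binary(testData, prediction, "Predictions KNN");
    }
    cv::Mat prediction;
};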


The simplicity of the algorithm comes at a price: to classify well, it needs a lot of training samples, so that enough relevant vectors fall among the K nearest neighbors of each query. Because every training sample has to be stored and compared against at classification time, it is not optimized for speed or space. In addition to consuming a lot of memory, it is really slow.


  1. Thanks for your post! It is really informative. If a neighbor has several parameters (variables), what would the code look like?

  2. Hey.. I need your help.. I want code that detects the red color in an image using a KNN classifier.

    Thank you