
k-Nearest Neighbors

  • non-parametric

  • used for classification and regression

    • classification: an object is classified by a majority vote of its neighbors, i.e. it is assigned to the class most common among its k nearest neighbors
    • regression: the output is the property value for the object, computed as the average of the values of its k nearest neighbors
  • instance-based learning, or lazy learning: the function is only approximated locally and all computation is deferred until a prediction (classification or regression) is requested

  • it can be useful to weight the neighbors' contributions so that nearer neighbors count more, e.g. weight 1/d, where d is the distance to the neighbor

  • training: store feature vectors and class labels

  • classification: majority vote among the k nearest neighbors

  • drawback of majority voting: if the class distribution is skewed, the most frequent class tends to dominate the prediction, because its examples are more likely to appear among the k nearest neighbors
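
The points above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes Euclidean distance, and the function names (`k_nearest`, `knn_classify`, `knn_regress`), the small `eps` guard against division by zero, and the toy data are all made up for this example. "Training" is just keeping the list of (feature vector, target) pairs.

```python
import math
from collections import defaultdict

def k_nearest(train, query, k):
    """Return the k (distance, target) pairs closest to query.

    Assumes Euclidean distance; train is a list of (features, target).
    """
    dists = [(math.dist(x, query), y) for x, y in train]
    return sorted(dists, key=lambda pair: pair[0])[:k]

def knn_classify(train, query, k, weighted=False, eps=1e-9):
    """Majority vote among the k nearest neighbors.

    With weighted=True each neighbor votes with weight 1/d
    (eps is an illustrative guard for d == 0).
    """
    votes = defaultdict(float)
    for d, label in k_nearest(train, query, k):
        votes[label] += 1.0 / (d + eps) if weighted else 1.0
    return max(votes, key=votes.get)

def knn_regress(train, query, k):
    """Average of the target values of the k nearest neighbors."""
    neighbors = k_nearest(train, query, k)
    return sum(y for _, y in neighbors) / len(neighbors)

# Toy, skewed data: class "b" is more frequent, but the two "a"
# points lie much closer to the query.
labeled = [
    ((0.1, 0.0), "a"), ((0.0, 0.2), "a"),
    ((1.0, 1.0), "b"), ((1.2, 1.0), "b"),
    ((1.0, 1.3), "b"), ((2.0, 2.0), "b"),
]
query = (0.0, 0.0)

plain_vote = knn_classify(labeled, query, k=5)                    # "b"
weighted_vote = knn_classify(labeled, query, k=5, weighted=True)  # "a"

# Regression on 1-D toy data: the three nearest targets are 1, 2, 3.
valued = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 100.0)]
estimate = knn_regress(valued, (1.0,), k=3)                       # 2.0
```

With k = 5 the plain vote goes to the more frequent class "b" (3 votes to 2), even though both "a" points are far closer to the query; with 1/d weights the prediction is "a". This is the skew problem from the last bullet, and distance weighting is one way to reduce it.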