Other Classification Methods
•k-nearest neighbor classifier
•Case-based reasoning
•Genetic algorithm
•Rough set approach
•Fuzzy set approaches
Instance-Based Methods
•Instance-based learning (or learning by analogy):
–Store training examples and delay the processing (“lazy evaluation”) until a new instance must be classified
•Typical approaches
–k-nearest neighbor approach
•Instances represented as points in a Euclidean space.
–Locally weighted regression
•Constructs a local approximation of the target function around each query point (see the sketch at the end of this list)
–Case-based reasoning
•Uses symbolic representations and knowledge-based inference
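A minimal sketch of the "local approximation" idea behind locally weighted regression, assuming a Gaussian kernel for the weights; the function name, the bandwidth parameter tau, and the toy data are illustrative assumptions, not from the slides.

```python
import numpy as np

def locally_weighted_regression(X, y, x_query, tau=1.0):
    """Fit a linear model weighted toward points near x_query and predict there.

    X: (n, d) training inputs, y: (n,) targets, tau: kernel bandwidth (assumed).
    """
    # Gaussian kernel: training points close to the query get weight near 1
    d2 = np.sum((X - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * tau ** 2))
    # Weighted least squares on the design matrix [1, x]
    Xa = np.hstack([np.ones((X.shape[0], 1)), X])
    W = np.diag(w)
    theta = np.linalg.pinv(Xa.T @ W @ Xa) @ (Xa.T @ W @ y)
    return np.concatenate(([1.0], x_query)) @ theta

# Toy usage: noisy 1-D sine curve, prediction at x = 0.5
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
print(locally_weighted_regression(X, y, np.array([0.5]), tau=0.5))
```

Unlike plain k-NN, every training example contributes to the fit, but examples far from the query are effectively ignored through the kernel weights.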
The k-Nearest Neighbor Algorithm
•All instances correspond to points in the n-D space.
•The nearest neighbors are defined in terms of Euclidean distance.
•The target function could be discrete- or real-valued.
•For a discrete-valued target function, k-NN returns the most common value among the k training examples nearest to x_q (see the sketch below).
•Voronoi diagram: the decision surface induced by 1-NN for a typical set of training examples.
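A minimal sketch of the discrete-valued k-NN rule above, in plain Python; the helper names and the small 2-D toy data are illustrative assumptions.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two points in n-D space."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_classify(training, x_q, k=3):
    """Return the most common class among the k training examples nearest to x_q.

    training: list of (point, class_label) pairs; x_q: the query point.
    """
    neighbors = sorted(training, key=lambda ex: euclidean(ex[0], x_q))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy usage: two 2-D classes
train = [((1.0, 1.0), 'A'), ((1.2, 0.8), 'A'), ((5.0, 5.0), 'B'), ((5.5, 4.8), 'B')]
print(knn_classify(train, (1.1, 1.0), k=3))   # -> 'A'
```

With k = 1 this rule carves the space into the Voronoi cells of the training points, which is the decision surface mentioned above.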
Discussion on the k-NN Algorithm
•The k-NN algorithm for continuous-valued target functions
–Return the mean of the values of the k nearest neighbors
•Distance-weighted nearest neighbor algorithm
–Weight the contribution of each of the k neighbors according to its distance to the query point x_q
•giving greater weight to closer neighbors
–For real-valued target functions, return the distance-weighted mean of the neighbors' values (see the sketch at the end of this slide)
•Robust to noisy data, since predictions average over the k nearest neighbors
•Curse of dimensionality: distance between neighbors could be dominated by irrelevant attributes.
–To overcome it, stretch the axes (weight attributes by relevance) or eliminate the least relevant attributes.
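A sketch of the distance-weighted variant for a real-valued target function; the 1/d² weighting is one common choice (an assumption here), and the 1-D toy data is illustrative.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_regress_weighted(training, x_q, k=3):
    """Distance-weighted k-NN for a real-valued target.

    training: list of (point, real_value) pairs.
    Closer neighbors get larger weights (here w = 1/d^2); if the query
    coincides with a training point, its value is returned directly.
    """
    neighbors = sorted(training, key=lambda ex: euclidean(ex[0], x_q))[:k]
    num, den = 0.0, 0.0
    for point, value in neighbors:
        d = euclidean(point, x_q)
        if d == 0.0:
            return value          # exact match: no averaging needed
        w = 1.0 / (d * d)
        num += w * value
        den += w
    return num / den

# Toy usage: prediction near the cluster of close points
train = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 20.0)]
print(knn_regress_weighted(train, (1.5,), k=3))   # dominated by the nearby points
```

Setting all weights to 1 recovers the plain mean of the k neighbors described at the top of this slide.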