The KNN algorithm is a type of instance-based learning algorithm that classifies a new observation by comparing it with the closest observations in the dataset, using a similarity metric such as Euclidean distance.

KNN can be used for both 🏷 Classification and 🔄 Regression tasks.

Notes

It is a Supervised Learning algorithm that can classify or regress an observation based on its proximity to other points in the dataset. The name comes from the idea that an observation will fall close to its k nearest neighbors.

Info

KNN is a predictive modeling technique

TakeAways

  • 🏷️ KNN works by comparing the new observation with the closest k observations in the dataset, then assigning it the category most common among those k neighbors.
  • 👭 The choice of k and the similarity metric (Euclidean distance, for example) can significantly impact the performance of KNN.
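The sensitivity to k can be seen in a minimal sketch (the data points and labels below are made up for illustration): a query point whose single nearest neighbor belongs to one class can still be assigned the other class once more neighbors get a vote.

```python
from collections import Counter
import math

def predict(X, y, query, k):
    # Rank training points by Euclidean distance to the query
    ranked = sorted(zip(X, y), key=lambda p: math.dist(p[0], query))
    # Majority vote among the k closest labels
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# One nearby "B" point surrounded by slightly farther "A" points
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.4, 0.4)]
y = ["A", "A", "A", "A", "B"]

print(predict(X, y, (0.5, 0.5), k=1))  # → B (the single nearest neighbor)
print(predict(X, y, (0.5, 0.5), k=5))  # → A (majority of all 5 neighbors)
```

Swapping `math.dist` for another metric (e.g. Manhattan distance) could likewise change which points count as "nearest", which is why the metric choice matters as much as k.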

Process

  1. Choosing k: Select the number k, which represents the number of nearest neighbors to consider when making a prediction.
  2. Calculating Distances: Calculate the distances between the new observation and all observations in the dataset using the chosen similarity metric.
  3. Selecting Neighbors: Select the k observations that are closest to the new observation.
  4. Majority Voting or Averaging: Assign the category most common among the k nearest neighbors (for 🏷 Classification) or calculate their average (for 🔄 Regression).
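The four steps above can be sketched in plain Python (the toy dataset and function names are illustrative, not from any particular library):

```python
from collections import Counter
import math

def euclidean(a, b):
    # Straight-line distance between two equal-length feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(X, y, query, k=3, task="classification"):
    # Step 2: distances from the query to every stored observation
    distances = sorted((euclidean(row, query), label) for row, label in zip(X, y))
    # Step 3: labels of the k closest observations
    neighbors = [label for _, label in distances[:k]]
    # Step 4: majority vote for classification, average for regression
    if task == "classification":
        return Counter(neighbors).most_common(1)[0][0]
    return sum(neighbors) / k

# Toy dataset: two well-separated clusters
X = [(1, 1), (1, 2), (2, 1), (6, 6), (6, 7), (7, 6)]
y = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(X, y, (2, 2), k=3))  # → A (all 3 nearest points are in cluster A)
```

Note that there is no training phase beyond storing the data, which is why KNN is often called a "lazy" learner: all the work happens at prediction time.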

Thoughts

  • 💡 Remember that the choice of k and of the similarity metric directly impacts KNN’s performance.