K-means is a partitioning method. It requires a prespecified number of cluster (). The algorithm divides the data into that number of clusters, and then iterates until it converges to the ideal content in the clusters. This algorithm clusters by minimizing within-clusters sum of squares.

Formally, K-means clustering works by clustering items into clusters called , where . An item can only belong to one cluster, so the clusters are disjoint. Each item has a vector of features , and is the mean vector of the items in cluster . The items are assigned to clusters by minimizing the sum of squares

Step 1. Assign items randomly to the K clusters .

Step 2. Compute for each cluster .Note that is a vector the length of the number of features, and contains the mean for each feature in .

Step 3. Every item is reassigned to a cluster with the mean vector that is closest to that item’s value.

If there is not equal variance in the data, or one variable is on a much larger scale than others, that variable will influence K-means decisions the most. Thus, it is common to normalize all variables/features so they contribute equally.

Is is easy to see how this algorithm becomes K-mediods if the median is used instead of the mean. K-mediods is robust to outliers.