ML0058 K-means++

Please explain how K-means++ works.

Answer

K-means++ is an improved way to initialize centroids in K-means. Instead of picking all $K$ initial centroids uniformly at random, it selects them one by one: each new centroid is sampled from the data with probability proportional to its squared distance from the nearest centroid already chosen. This spreads the initial centroids across the data, which reduces the chance of a poor clustering and typically helps the algorithm converge faster and more reliably.

K-means++ Steps:
(1) Choose the first centroid $\mu_1$ uniformly at random from the dataset.
(2) For each point $x_i$, compute its squared distance to the nearest chosen centroid:
$D(x_i)^2 = \min_{1 \le j \le m} \|x_i - \mu_j\|^2$
Where:
$m$ is the number of centroids chosen so far.
$\mu_j$ is one of the already chosen centroids.
(3) Choose the next centroid $\mu_{m+1}$ from the data points, selecting point $x_i$ with probability:
$P(x_i) = \frac{D(x_i)^2}{\sum_{l=1}^{n} D(x_l)^2}$
Where:
$D(x_i)^2$ is the squared distance from point $x_i$ to its nearest already chosen centroid.
$\sum_{l=1}^{n} D(x_l)^2$ is the sum of these minimum squared distances over all $n$ data points, so the probabilities sum to 1. A point that is already a centroid has $D(x_i)^2 = 0$ and cannot be chosen again.
(4) Repeat steps (2) and (3) until $K$ centroids are chosen.
(5) Then proceed with standard K-means clustering, using these $K$ centroids as the initial centers.
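
Steps (1) to (4) can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the names `sq_dist` and `kmeans_pp_init`, the `seed` parameter, and the uniform fallback when every distance is zero are assumptions made for this example, not part of the description above.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_pp_init(points, k, seed=None):
    """Return k initial centroids picked from `points` by K-means++ seeding."""
    rng = random.Random(seed)
    n = len(points)

    # Step 1: choose the first centroid uniformly at random.
    centroids = [points[rng.randrange(n)]]

    # Step 2: D(x_i)^2 for every point, given the single centroid so far.
    d2 = [sq_dist(p, centroids[0]) for p in points]

    # Step 4: repeat until k centroids are chosen.
    while len(centroids) < k:
        if sum(d2) > 0:
            # Step 3: sample index i with probability D(x_i)^2 / sum of all D^2.
            idx = rng.choices(range(n), weights=d2, k=1)[0]
        else:
            # Assumption: all points coincide with a centroid, so pick uniformly.
            idx = rng.randrange(n)
        centroids.append(points[idx])

        # Only the newest centroid can shrink a point's nearest-centroid distance.
        d2 = [min(d, sq_dist(p, points[idx])) for d, p in zip(d2, points)]

    return centroids
```

For example, `kmeans_pp_init([(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)], k=2, seed=0)` returns two of the four input points. Because the weights are squared distances, the second centroid is far more likely to come from the group that does not contain the first one. Updating `d2` incrementally keeps each round at one distance computation per point instead of recomputing the minimum over all chosen centroids.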