ML0057 K-means

Please explain how K-means works.

Answer

K-means is an iterative unsupervised algorithm that groups data into K clusters by minimizing intra-cluster distances. It alternates between assigning each point to its nearest centroid and recomputing each centroid from its assigned points, repeating until convergence. It is fast and easy to implement, but the result depends on the initial centroids, and it handles non-convex cluster shapes poorly.

Goal of K-means: Partition the data into K clusters by minimizing within-cluster variance.
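
One common way to write this goal is as a within-cluster sum of squared distances. The symbol J is notation assumed here for illustration; c_k denotes the center of cluster k and C_k the set of points assigned to it, matching the steps that follow.

```latex
J = \sum_{k=1}^{K} \sum_{x \in C_k} \lVert x - c_k \rVert^2
```

Each assignment step and each update step can only keep J the same or lower it, which is why the procedure settles down, although not necessarily at the best possible clustering.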

K-means Steps:
(1) Initialization: Randomly choose K centroids.
(2) Assignment step: Assign each point to the closest centroid using Euclidean distance, given by:
 d(x, c_k) = \sqrt{\sum_{i=1}^{n} (x_i - c_{k,i})^2}
Where:
 x is the data point.
 c_k is the cluster center.
 n is the number of features (dimensions) of each point.
(3) Update step: Compute each new cluster center as the mean of all points assigned to that cluster:
 c_k = \frac{1}{|C_k|} \sum_{x \in C_k} x
Where:
 C_k represents the set of points assigned to cluster k, and |C_k| is the number of points in that set.
(4) Convergence: Repeat assignment and update steps until cluster centers stabilize or a stopping criterion is met.
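
The four steps above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the parameters `max_iters`, `tol`, and `seed`, the choice to initialize from K randomly sampled data points, and the rule that an empty cluster keeps its previous centroid are all assumptions made for this example.

```python
import math
import random


def kmeans(points, k, max_iters=100, tol=1e-9, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # (1) Initialization: pick k distinct data points as starting centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iters):
        # (2) Assignment: index of the nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # (3) Update: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # (4) Convergence: stop once no centroid moves more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids, labels


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which cluster receives index 0 depends on the random initialization.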

An example of K-means clustering is shown below.
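
As a small worked illustration, the sketch below traces K-means on six one-dimensional points with K = 2. The data and the deliberately poor starting centroids (1.0 and 2.0) are made up for this example. It records the centroids after every update so the movement toward the two groups is visible.

```python
# Six 1-D points forming two obvious groups, with K = 2.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids = [1.0, 2.0]  # assumed (deliberately poor) starting centroids

history = [list(centroids)]
while True:
    # Assignment step: each point joins its nearest centroid.
    clusters = [[] for _ in centroids]
    for x in points:
        nearest = min(range(len(centroids)), key=lambda j: abs(x - centroids[j]))
        clusters[nearest].append(x)
    # Update step: each centroid moves to the mean of its cluster
    # (an empty cluster would keep its old centroid).
    new_centroids = [
        sum(c) / len(c) if c else old for c, old in zip(clusters, centroids)
    ]
    if new_centroids == centroids:
        break  # centroids stopped moving: converged
    centroids = new_centroids
    history.append(list(centroids))

for step, cs in enumerate(history):
    print(step, cs)
```

The centroids move from [1.0, 2.0] to [1.0, 7.6] and then to [2.0, 11.0], where they stay. The final clusters are {1, 2, 3} and {10, 11, 12}.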

