K-means++ - Wikipedia, the free encyclopedia
In data mining, k-means++ [1] [2] is an algorithm for choosing the initial values (or "seeds") for the k-means clustering algorithm. It was proposed in 2007 by David Arthur and Sergei Vassilvitskii as an approximation algorithm for the NP-hard k-means problem, and as a way of avoiding the sometimes poor clusterings found by the standard k-means algorithm. It is similar to the first of three seeding methods proposed, in independent work, in 2006 [3] by Rafail Ostrovsky, Yuval Rabani, Leonard Schulman and Chaitanya Swamy. (The distribution of the first seed is different.)

The standard k-means algorithm, however, has at least two major theoretical shortcomings. First, it has been shown that the worst-case running time of the algorithm is super-polynomial in the input size. [5] Second, the approximation found can be arbitrarily bad with respect to the objective function compared to the optimal clustering.
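The seeding idea can be sketched in a few lines of Python. This is a minimal sketch of the squared-distance sampling commonly described for k-means++, not code taken from the article: the first seed is drawn uniformly at random, and each later seed is drawn with probability proportional to its squared distance from the nearest seed already chosen. The names `kmeans_pp_seeds` and `squared_distance`, the representation of points as tuples, and the uniform fallback when every remaining distance is zero are assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_seeds(points, k, rng=None):
    """Pick k initial centers from `points` by squared-distance sampling.

    Hypothetical helper for illustration; `rng` is an optional
    random.Random instance so that runs can be made reproducible.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = rng or random.Random()

    # First seed: uniform over the data (assumed here, as in the 2007 proposal).
    centers = [rng.choice(points)]

    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Far-away points are proportionally more likely to be chosen.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Update nearest-center distances against the newly added center only.
        d2 = [min(old, squared_distance(p, nxt)) for p, old in zip(points, d2)]

    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (9.8, 0.1)]
    print(kmeans_pp_seeds(data, 3, random.Random(42)))
```

The seeds returned this way would then be used as the starting centers for the standard k-means iteration, in place of centers chosen uniformly at random.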